Global ETD Search

141	Modelos de sobrevivência com fração de cura e omissão nas covariáveis / Survival models with cure fraction and missing covariates
Fonseca, Renata Santana. 06 March 2009.
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior.
In this work we study the survival cure rate model proposed by Yakovlev et al. (1993), which is formulated in a competing-risks setting. Covariates are introduced to model the cure rate through the mean number of competing risks, and some of these covariates are allowed to have missing values. We consider only the case in which the missing covariates are categorical, and we obtain the maximum likelihood estimates with the EM algorithm via the method of weights. We present a Monte Carlo simulation experiment comparing the properties of the estimators obtained by this method with those obtained from a complete-case analysis, in which observations with missing values are excluded from the data set. In the same experiment we also evaluate the impact on the parameter estimates of increasing the proportion of immune (cured) individuals and the proportion of censored individuals among those who are not immune. We illustrate the proposed methodology with a real data set on the time to graduation in the undergraduate Statistics course at the Universidade Federal do Rio Grande do Norte.
Keywords: Survival analysis; Cure rate; Missing data; EM algorithm.
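For orientation, the cure rate model referred to in entry 141 is usually written as below. This is a sketch of the standard formulation found in the literature, not a formula quoted from the thesis; the symbols N, Z_j, F, θ, x and β, and the log link for θ, are assumptions introduced here.

```latex
% N: unobserved number of competing risks; Z_j: latent event times with cdf F
N \sim \mathrm{Poisson}(\theta), \qquad
T = \min\{Z_1, \dots, Z_N\} \quad (T = \infty \text{ if } N = 0)

% Population survival function, summing over the Poisson distribution of N
S_{\mathrm{pop}}(t) = \sum_{n \ge 0} \frac{e^{-\theta}\theta^{n}}{n!}\,\{1 - F(t)\}^{n}
                    = \exp\{-\theta F(t)\}

% Cure fraction and a covariate link for the mean number of risks (assumed log link)
\lim_{t \to \infty} S_{\mathrm{pop}}(t) = e^{-\theta}, \qquad
\theta = \exp(x^{\top}\beta)
```

Under this parameterization, a covariate that raises θ lowers the cure fraction, which is how modeling the mean number of risks models the cure rate.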
142	Multivariate skew-normal/independent distributions: properties and inference
Lachos, Victor H.; Labra, Filidor V. 25 September 2017.
Liu (1996) discussed a class of robust normal/independent distributions which contains a group of thick-tailed cases. In this article, we develop a skewed version of these distributions in the multivariate setting, which we call multivariate skew-normal/independent distributions, and we derive several useful properties for them. The main virtue of the members of this family is that they are easy to simulate and lend themselves to an EM-type algorithm for maximum likelihood estimation of their parameters. For two multivariate models of practical interest, the EM-type algorithm is discussed with emphasis on the skew-t, the skew-slash, and the contaminated skew-normal distributions. Results obtained from simulations and from two real data sets are also reported.
Keywords: EM algorithm; Normal/independent distributions; Skewness; Measurement error models; 62F03; 62F05; 62F10; 62F12.
143	Modelos lineares generalizados mistos para dados longitudinais / Generalized linear mixed models in longitudinal data
Silvano Cesar da Costa. 13 March 2003.
Experiments in which the response variables are proportions or counts are very common in many research areas, especially in agriculture. The widely known theory of generalized linear models (McCullagh & Nelder, 1989; Demetrio, 2001) is used to analyze such experiments when the responses are independent. If the estimated variance is greater than the expected variance, a dispersion parameter is estimated and included in the parameter estimation process. When the response variable is observed over time, the observations may be correlated, and this must be taken into account in the parameter estimation. One way of handling this correlation is the methodology of generalized estimating equations (GEEs) discussed by Liang & Zeger (1986), although in that case the interest lies in the estimates of the fixed effects, and the working correlation matrix is included only to obtain a better fit. Another alternative is to include a latent effect in the linear predictor to capture variability that is not accounted for in the model and that might influence the results. In this work, a random effect and a dispersion parameter are combined and included jointly in the parameter estimation. This methodology is applied to a data set from an experiment with camu-camu, whose objective was to evaluate, through the proportion of successful seedling grafts, which grafting methods and rootstock types are best suited for use. Several models are fitted, from the split-plot model (assuming independence) to the model in which the dispersion parameter and the random effect are considered jointly. There is evidence that the model including both the random effect and the dispersion parameter produces better parameter estimates. A second longitudinal data set comes from an experiment with MON810 transgenic corn, in which the response variable is the number of caterpillars (Spodoptera frugiperda). In this case, because of the excess of zero responses, the zero-inflated Poisson (ZIP) regression model is used, in addition to the standard Poisson model, in which the observations are considered independent, and the zero-inflated Poisson regression model with a random effect. The results show that the random effect included in the linear predictor was not significant and, therefore, the adopted model is the zero-inflated Poisson regression model. The results were obtained using the NLMIXED, GENMOD and GPLOT procedures of SAS (Statistical Analysis System), version 8.2.
Keywords: binomial distribution; EM algorithm; generalized linear mixed models; generalized linear models; longitudinal data analysis; Poisson distribution; SAS (computer program).
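To make the zero-inflated Poisson model of entry 143 concrete, here is a minimal sketch of an EM fit for the simplest case: no covariates and no random effect. It is an illustration only; the thesis reports fitting its models with SAS procedures, and the function name `zip_em`, its arguments, and the starting values are hypothetical choices made here.

```python
import math


def zip_em(y, max_iter=500, tol=1e-10):
    """EM for an intercept-only zero-inflated Poisson model (illustrative sketch).

    Model assumed here: with probability pi an observation is a structural zero,
    otherwise it is drawn from Poisson(lam). Returns the estimates (pi, lam).
    """
    n = len(y)
    n_zero = sum(1 for v in y if v == 0)
    total = sum(y)
    # Crude starting values (an arbitrary choice for this sketch).
    pi = 0.5 * n_zero / n
    lam = max(total / n, 1e-8)
    for _ in range(max_iter):
        # E-step: posterior probability that an observed zero is structural.
        p0 = pi / (pi + (1.0 - pi) * math.exp(-lam))
        expected_structural = n_zero * p0
        # M-step: closed-form updates given the expected memberships.
        new_pi = expected_structural / n
        new_lam = total / (n - expected_structural)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam
```

Calling `zip_em(counts)` on a list of non-negative integer counts returns the estimated structural-zero probability and Poisson mean. A regression version would replace the two closed-form updates with weighted logistic and Poisson fits.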
144	Distribuição slash multivariada aplicada a dados agrícolas / Multivariate slash distribution applied to agricultural data
Fagundes, Regiane Slongo. 17 January 2017.
Funding: Fundação Araucária de Apoio ao Desenvolvimento Científico e Tecnológico do Estado do Paraná (FA).
This study discusses problems of multivariate statistical inference and linear spatial modeling when the observations come from a continuous, symmetric population with a multivariate slash distribution. First, the slash distribution is reparameterized under the assumption that the second moment is finite, and some of its properties are presented. Analytical expressions for the score function and the Fisher information matrix of the reparameterized distribution are derived. Maximum likelihood estimation of the parameters is carried out with an EM (Expectation-Maximization) type algorithm. Tests of linear hypotheses about the mean vector and the covariance matrix are described using the C(α), likelihood ratio, Wald, and score statistics. Simulation studies were carried out to evaluate the efficiency of the statistical tests and of the EM algorithm. Agricultural data illustrate the methodology, with tests of equality of means, sphericity, and equicorrelation applied to them. As an illustration of the multivariate slash distribution in statistical modeling, a slash linear spatial model, with and without covariates, is discussed and proposed. Global and local influence diagnostics are presented in order to evaluate the influence of the observations on the parameter estimation process. The curvatures required by the local influence procedure are derived for the slash model, with the perturbation scheme adapted to the distribution and several perturbation schemes considered. Spatial variability maps of soil chemical attributes and yield were generated by kriging with external drift. The results of the simulations and applications indicate that the slash distribution is a robust alternative when the data exhibit high kurtosis.
Keywords: EM algorithm; Global and local influence; Slash linear spatial model; Hypothesis tests; CIENCIAS AGRARIAS::ENGENHARIA AGRICOLA.
145	Training of Hidden Markov models as an instance of the expectation maximization algorithm
Majewsky, Stefan. 27 July 2017.
In Natural Language Processing (NLP), speech and text are parsed and generated with language models and parser models, and translated with translation models. Each model contains a set of numerical parameters which are found by applying a suitable training algorithm to a set of training data. Many such training algorithms are instances of the Expectation-Maximization (EM) algorithm. In [BSV15], a generic EM algorithm for NLP is described. This work presents a particular speech model, the Hidden Markov model, and its standard training algorithm, the Baum-Welch algorithm. It is then shown that the Baum-Welch algorithm is an instance of the generic EM algorithm introduced by [BSV15], from which it follows that all statements about the generic EM algorithm also apply to the Baum-Welch algorithm, in particular its correctness and convergence properties.
Keywords: hidden Markov model; expectation-maximization algorithm; EM algorithm; Baum-Welch algorithm; ddc:510; rvk:SK 820.
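As a concrete companion to entry 145, the following is a textbook-style sketch of one Baum-Welch iteration for a hidden Markov model with discrete emissions and a single observation sequence. It is not the generic EM algorithm of [BSV15] nor code from the thesis; the function names, the list-of-lists parameter layout, and the per-step scaling scheme are assumptions made for this illustration.

```python
import math


def forward_backward(obs, start, trans, emit):
    """Scaled forward-backward pass; returns (alpha, beta, scale)."""
    n_states, length = len(start), len(obs)
    alpha = [[0.0] * n_states for _ in range(length)]
    beta = [[1.0] * n_states for _ in range(length)]
    scale = [0.0] * length
    for t in range(length):
        for j in range(n_states):
            if t == 0:
                prior = start[j]
            else:
                prior = sum(alpha[t - 1][i] * trans[i][j] for i in range(n_states))
            alpha[t][j] = prior * emit[j][obs[t]]
        scale[t] = sum(alpha[t])  # P(obs[t] | obs[0..t-1])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    for t in range(length - 2, -1, -1):
        for i in range(n_states):
            beta[t][i] = sum(
                trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n_states)
            ) / scale[t + 1]
    return alpha, beta, scale


def baum_welch_step(obs, start, trans, emit):
    """One EM iteration. Returns (new_start, new_trans, new_emit, loglik),
    where loglik is the log-likelihood under the INPUT parameters."""
    n_states, n_symbols, length = len(start), len(emit[0]), len(obs)
    alpha, beta, scale = forward_backward(obs, start, trans, emit)
    # E-step: state posteriors (gamma) and expected transition counts (xi_sum).
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n_states)] for t in range(length)]
    xi_sum = [[0.0] * n_states for _ in range(n_states)]
    for t in range(length - 1):
        for i in range(n_states):
            for j in range(n_states):
                xi_sum[i][j] += (
                    alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                    / scale[t + 1]
                )
    # M-step: renormalize the expected counts.
    new_start = list(gamma[0])
    new_trans = [[x / sum(row) for x in row] for row in xi_sum]
    new_emit = []
    for j in range(n_states):
        occupancy = sum(gamma[t][j] for t in range(length))
        counts = [0.0] * n_symbols
        for t in range(length):
            counts[obs[t]] += gamma[t][j]
        new_emit.append([c / occupancy for c in counts])
    loglik = sum(math.log(c) for c in scale)
    return new_start, new_trans, new_emit, loglik
```

Repeatedly feeding the returned parameters back into `baum_welch_step` gives a sequence of log-likelihoods that should never decrease, which is the convergence property the abstract says carries over from the generic EM algorithm. The sketch assumes strictly positive starting parameters and does no smoothing for states that receive no posterior mass.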
146	Parallel Tomographic Image Reconstruction On Hierarchical Bus-Based And Extended Hypercube Architectures
Rajan, K. 07 1900.
No description available.
Keywords: Image Processing - Digital Techniques; Tomography; Extended Hypercube; Positron Emission Tomography; PET; Image Reconstruction; Expected Maximization Algorithm; EM Algorithm; Pixel-Based Reconstruction Algorithm; PBR Algorithm; Electronic Engineering.
147	Apprentissage supervisé à partir des multiples annotateurs incertains / Supervised Learning from Multiple Uncertain Annotators
Wolley, Chirine. 01 December 2014.
In supervised learning, obtaining the ground-truth label for every instance of a training data set can be difficult, time-consuming, and expensive. With the growth of the Internet, an increasing number of web services offer crowdsourcing as a way to collect a sufficiently large set of labels from Internet users. These services make it easy to collect labels from anonymous annotators and thus considerably simplify the construction of labeled data sets. Their main drawback is the lack of control over the annotators: levels of expertise can be very heterogeneous, and neither the accuracy of the labels nor the expertise of each labeler can be verified, so the resulting data are not necessarily reliable. Managing annotator uncertainty is therefore key to learning from multiple non-expert annotators. This thesis proposes three probabilistic algorithms that handle annotator uncertainty and data quality during the learning phase. IGNORE generates a classifier that predicts the label of a new instance while evaluating the performance of each annotator according to their level of uncertainty. X-IGNORE additionally accounts for data quality: it assumes that annotator performance depends both on the annotators' uncertainty and on the quality of the data they annotate. Finally, ExpertS addresses annotator selection during learning: it eliminates the weakest annotators, identifies the expert annotators, and learns the classifier from their labels only. A large set of experiments on synthetic and real-world medical data shows the performance, accuracy, and stability of our models compared with previous state-of-the-art algorithms.
Keywords: Supervised learning; Uncertainty; Multiple annotators; Properties of labelers; Data quality; Bayesian analysis; EM algorithm; 004.
148	Approche EM pour modèles multi-blocs à facteurs à une équation structurelle / EM estimation of a structural equation model
Tami, Myriam. 12 July 2016.
Structural equation models with latent variables make it possible to model relationships between observed and unobserved variables. The two current paradigms for estimating these models are component-based partial least squares (PLS) and covariance-structure analysis (LISREL). In this work, after describing the PLS and LISREL methods, we propose an estimation approach that uses the EM algorithm to maximize the full likelihood of a model with latent factors and one structural equation. Through a simulation study we investigate how fast and accurate the method is, and through an application to real environmental data we show how to build such a model in practice and assess its quality. Finally, we apply the approach to longitudinal health-related quality-of-life data from an oncology clinical trial. We show that, by efficiently reducing the dimension of the data, the EM approach simplifies the longitudinal analysis of quality of life and avoids multiple testing, and thus helps in evaluating the clinical benefit of a treatment.
Keywords: Structural equation models; Factor models; Latent variables; EM algorithm; Estimation methods; Data analysis.
149	"Uma aplicação industrial de regressão binária com erros na variável explicativa" / "An industrial application of binary regression with errors-in-variable explanatory"
Daniel Fernando de Favari. 22 June 2006.
In this work, we apply a binary regression model with measurement error in the explanatory variable to analyze attribute-type measurement systems. To this end, we use the logistic model with errors in variables, for which we obtain the maximum likelihood estimates via the EM algorithm, together with the observed Fisher information matrix. In addition, we carry out a simulation study to compare the analytic method, the logistic model without errors in variables (the naive model), and the logistic model with errors in variables. Finally, we apply the methodology to evaluate a go/no-go measurement system at the world's largest Diesel engine manufacturer (MWM International).
Keywords: analytic method; binary regression; Delta method; EM algorithm; Fieller's theorem; measurement errors-in-variable.
150	Análise dos resultados de ensaios de proficiência via modelos de regressão com variável explicativa aleatória / Analysis of proficiency tests results via regression models with random explanatory variable
Aline Othon Montanari. 21 June 2004.
In a proficiency testing (PT) program conducted by the Grupo de Motores, a group of eleven temperature laboratories carried out measurements at five points on the scale of a thermocouple. In this work, we propose a regression model with a random explanatory variable X representing the temperature measured by the standard thermocouple, which we call the artifact, and a dependent variable Y representing the laboratories' measurements. The comparison procedure is simple: the artifact and the laboratory's thermocouple under test are placed in the oven and the differences between their measurements are recorded. For the data analysis, the response variable is the difference between that difference and the reference value, which is determined by two laboratories belonging to the Rede Brasileira de Calibração (RBC, the Brazilian Calibration Network). The measurement error has a variance that is determined by calibration and is therefore known. We then obtain approximations to the maximum likelihood estimates of the model parameters via the EM algorithm. In addition, we propose a strategy for assessing the consistency of the laboratories participating in the PT program.
Keywords: EM algorithm; Estimation; Interlaboratory comparisons; Proficiency tests; Random explanatory variable; Measurement uncertainty.
