131 |
Statistical approaches for natural language modelling and monotone statistical machine translation. Andrés Ferrer, Jesús, 11 February 2010 (has links)
This thesis gathers some contributions to statistical pattern recognition and, more specifically, to several natural language processing tasks. Several well-known statistical techniques are revisited: parameter estimation, loss function design and statistical modelling. These techniques are applied to natural language processing tasks such as text classification, natural language modelling and statistical machine translation.
Regarding parameter estimation, we address the smoothing problem by proposing a new constrained-domain maximum likelihood estimation technique (CDMLE). CDMLE avoids the need for a smoothing step, which causes the loss of the properties of the maximum likelihood estimator. This technique is applied to text classification with the Naive Bayes classifier. The CDMLE technique is then extended to leaving-one-out maximum likelihood estimation and applied to language model smoothing. The results obtained on several natural language modelling tasks show an improvement in terms of perplexity.
Regarding the loss function, the design of loss functions other than the 0-1 loss is studied carefully. The study focuses on loss functions that, while retaining a decoding complexity similar to that of the 0-1 loss, provide greater flexibility. We analyse and evaluate several loss functions on several machine translation tasks and with several translation models. We also analyse some translation rules that stand out for practical reasons, such as the direct translation rule, and we deepen the understanding of log-linear models, which are in fact particular cases of loss functions.
Finally, several monotone translation models based on statistical modelling techniques are proposed. / Andrés Ferrer, J. (2010). Statistical approaches for natural language modelling and monotone statistical machine translation [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/7109
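The abstract does not give the exact CDMLE formulation, so here is a hedged sketch of the general idea: a multinomial maximum likelihood estimate whose domain is constrained so that every cell probability stays above a floor ε, which builds smoothing into the estimator instead of adding it as a separate step. The floor parameterization and the bisection solver are illustrative assumptions, not the thesis's algorithm.

```python
import numpy as np

def cdmle_multinomial(counts, eps=1e-4, tol=1e-12):
    """Constrained-domain MLE: maximize sum(n_i * log p_i) subject to
    sum(p_i) = 1 and p_i >= eps.  The KKT conditions give
    p_i = max(eps, n_i / nu), with nu fixed by normalization; solve
    for nu by bisection (the sum is nonincreasing in nu)."""
    n = np.asarray(counts, dtype=float)
    assert eps * n.size < 1.0, "floor too large for this many categories"
    lo, hi = 1e-12, n.sum() / (1.0 - eps * n.size) + 1.0  # brackets for nu
    while hi - lo > tol * hi:
        nu = 0.5 * (lo + hi)
        if np.maximum(eps, n / nu).sum() > 1.0:
            lo = nu   # probabilities sum above 1: increase nu
        else:
            hi = nu
    return np.maximum(eps, n / nu)

# Zero counts receive probability eps rather than 0, with no separate
# smoothing step:
print(cdmle_multinomial([5, 3, 0, 0]))
```

Unlike add-one or back-off smoothing applied after estimation, the estimate above is still the maximizer of the likelihood over its (restricted) domain, which is the property the abstract highlights.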
|
132 |
Dynamic Programming Approaches for Estimating and Applying Large-scale Discrete Choice Models. Mai, Anh Tien, 12 1900 (has links)
People go through their lives making all kinds of decisions, and some of these decisions affect their demand for transportation, for example their choices of where to live and where to work, how and when to travel, and which route to take. Transport-related choices are typically time dependent and characterized by a large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications.
We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply for prediction because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to exploit the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful for modeling time-dependent choices, but also makes it easier to model large-scale static choices.
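As a hedged sketch of the dynamic programming step such models require (the toy network, utilities, and destination below are invented for illustration, not taken from the thesis): in a recursive-logit-style route choice model, the value of a node is the logsumexp of instantaneous utility plus downstream value over its outgoing links, and link choice probabilities follow from the fixed point.

```python
import numpy as np

# Toy road network: node 3 is the destination; utils[k] lists
# (successor, instantaneous utility) pairs for node k.
utils = {0: [(1, -1.0), (2, -1.5)],
         1: [(2, -0.5), (3, -2.0)],
         2: [(3, -1.0)],
         3: []}

def value_iteration(utils, dest, n_nodes, tol=1e-10):
    """Solve V(k) = log sum_a exp(u(a|k) + V(succ(a))), V(dest) = 0,
    by fixed-point iteration; V(k) is the expected maximum utility
    from node k to the destination under a logit route choice model."""
    V = np.zeros(n_nodes)
    while True:
        V_new = np.array([
            0.0 if k == dest else
            np.log(sum(np.exp(u + V[s]) for s, u in utils[k]))
            for k in range(n_nodes)])
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = value_iteration(utils, dest=3, n_nodes=4)
# Link choice probabilities at node 0: P(a|k) = exp(u + V[succ] - V[k]).
print({s: float(np.exp(u + V[s] - V[0])) for s, u in utils[0]})
```

Estimating such a model repeats this dynamic programming solve inside every likelihood evaluation, which is why the estimation-cost reductions discussed below matter.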
The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms.
Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated, based on the MEV and mixed logit models. The resulting route choice models are expensive to estimate, and we deal with this challenge by proposing methods that reduce the estimation cost. For example, we propose a decomposition method that not only opens up the possibility of mixing, but also speeds up the estimation of simple logit models, which has implications for traffic simulation as well. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models.
The second theme is related to the estimation of static discrete choice models with large choice sets.
We establish that a class of MEV models can be reformulated as dynamic discrete choice models on networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm.
Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process.
The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important.
Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
|
133 |
Drift estimation for jump diffusions. Mai, Hilmar, 08 October 2012 (has links)
The problem of parametric drift estimation for a Lévy-driven jump diffusion process is considered in two different settings: time-continuous and high-frequency observations. The goal is to develop explicit maximum likelihood estimators for both observation schemes that are efficient in the Hájek-Le Cam sense. The likelihood function based on time-continuous observations can be derived explicitly for jump diffusion models and leads to explicit maximum likelihood estimators for several popular model classes. We consider Ornstein-Uhlenbeck type, square-root and linear stochastic delay differential equations driven by Lévy processes in detail and prove strong consistency, asymptotic normality and efficiency of the likelihood estimators in these models. The appearance of the continuous martingale part of the observed process under the dominating measure in the likelihood function leads to a jump filtering problem in this context, since the continuous part is usually not directly observable and can only be approximated in the high-frequency limit. In the second part of this thesis the problem of drift estimation for discretely observed processes is considered. The estimators are constructed from discretizations of the time-continuous maximum likelihood estimators from the first part, where the continuous martingale part is approximated via a thresholding technique.
We are able to prove that even in the case of infinite-activity jumps of the driving Lévy process the estimator is asymptotically normal and efficient under weak assumptions on the jump behavior. Finally, the finite-sample behavior of the estimators is investigated on simulated data. We find that the maximum likelihood approach clearly outperforms the least squares estimator when jumps are present, and that the efficiency gap between the two techniques widens with growing jump intensity.
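As a hedged illustration of the jump-filtering idea (the Ornstein-Uhlenbeck toy setup, threshold constants, and Euler discretization are standard textbook choices, not necessarily the thesis's exact construction): increments larger than a threshold of order Δ^β are attributed to jumps and discarded, and the drift of dX_t = -θ X_t dt + dL_t is estimated from the surviving increments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an OU process with additive compound-Poisson jumps on a
# high-frequency grid (assumed toy model).
theta, T, n = 2.0, 100.0, 100_000
dt = T / n
X = np.zeros(n + 1)
for i in range(n):
    jump = rng.normal(0.0, 2.0) if rng.random() < 0.5 * dt else 0.0
    X[i + 1] = X[i] - theta * X[i] * dt + rng.normal(0.0, np.sqrt(dt)) + jump

# Jump filter: keep increments below c * dt**beta (beta < 1/2), then use
# the discretized likelihood estimator on the kept increments:
#   theta_hat = -sum(X_i dX_i) / (dt * sum(X_i^2))   over kept i
dX = np.diff(X)
keep = np.abs(dX) <= 3.0 * dt**0.49
theta_naive = -np.sum(X[:-1] * dX) / (dt * np.sum(X[:-1] ** 2))
theta_filt = -np.sum(X[:-1][keep] * dX[keep]) / (dt * np.sum(X[:-1][keep] ** 2))
print(f"naive: {theta_naive:.3f}  filtered: {theta_filt:.3f}  true: {theta}")
```

The filtered estimator approximates the continuous-observation likelihood estimator because, as Δ shrinks, the threshold separates the Brownian increments (of order √Δ) from the jumps.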
|
134 |
Antibrouillage de récepteur GNSS embarqué sur hélicoptère / Antijamming of GNSS receiver mounted on helicopter. Barbiero, Franck, 16 December 2014 (has links)
In hostile environments, Global Navigation Satellite System (GNSS) signals can be disturbed by intentional jamming. Using antenna arrays, space-time adaptive processing (STAP) is one of the most efficient methods to deal with these threats. However, when a GNSS receiver is placed near rotating bodies, non-stationary effects called Rotor Blade Modulation (RBM), created by multipath reflections on the blades of the helicopter, can significantly degrade the anti-jamming system, and the signal of interest can be lost. The work of this thesis is, consequently, to develop a GNSS protection system adapted to the RBM. To this end, an innovative multipath model adapted to this phenomenon has been developed. The model was validated by comparison with asymptotic electromagnetic simulations and experiments conducted on an EC-120 helicopter. Using a maximum likelihood algorithm, the parameters of the non-stationary part of the received signal are estimated. Finally, the RBM anti-jamming solution, combining an oblique projection algorithm and conventional STAP, mitigates the dynamic and then the static contributions of the interference. In the end, the navigation information is available again.
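The abstract does not spell out the projector, so as a hedged sketch: a standard oblique projection that passes a signal subspace H untouched while nulling an interference subspace S has the closed form E = H (Hᴴ P H)⁻¹ Hᴴ P, where P is the orthogonal projector onto the complement of span(S). The array dimensions and subspaces below are invented test data.

```python
import numpy as np

def oblique_projector(H, S):
    """Return E with E @ H == H (signal passed) and E @ S == 0
    (interference annihilated)."""
    P = np.eye(S.shape[0]) - S @ np.linalg.pinv(S)   # project out span(S)
    return H @ np.linalg.inv(H.conj().T @ P @ H) @ H.conj().T @ P

rng = np.random.default_rng(1)
H = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))  # signal
S = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))  # multipath
E = oblique_projector(H, S)

x = H @ np.array([1.0, 2.0]) + S @ np.array([5.0, -3.0, 1.0])  # snapshot
print(np.allclose(E @ x, H @ np.array([1.0, 2.0])))  # True: S part removed
```

In the thesis's setting the dynamic (RBM) multipath contribution would play the role of S, leaving the remaining static jammer to a conventional STAP stage.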
|
135 |
Estimación óptima de secuencias caóticas con aplicación en comunicaciones / Optimal estimation of chaotic sequences with application to communications. Luengo García, David, 23 November 2006 (has links)
This thesis studies the optimal estimation of chaotic signals generated by iterating one-dimensional maps and contaminated by additive white Gaussian noise, from the point of view of the two most common frameworks in statistical inference: maximum likelihood (ML) and Bayesian. Due to the high computational cost of the optimal estimators, several suboptimal but computationally efficient estimators are proposed, which attain performance similar to that of the optimal ones. Additionally, the estimation of the parameters of a chaotic map is analyzed, exploiting the known relation between consecutive samples of the chaotic sequence. Finally, we consider the application of the estimators developed to the design of receivers for two different chaotic communication schemes: chaotic switching and symbolic or chaotic coding.
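As a hedged illustration of the ML estimation problem (the logistic map, noise level, and brute-force grid are toy choices, not the thesis's estimators): under additive white Gaussian noise, maximum likelihood estimation of a chaotic sequence generated by a known one-dimensional map reduces to least squares over admissible trajectories, here approximated by a fine grid over the initial condition.

```python
import numpy as np

R = 3.9  # logistic map parameter, assumed known

def logistic_traj(x0, n):
    """Iterate x_{k+1} = R x_k (1 - x_k); works elementwise on arrays."""
    xs, out = np.asarray(x0, dtype=float), []
    for _ in range(n):
        out.append(xs)
        xs = R * xs * (1.0 - xs)
    return np.stack(out)              # shape (n,) + x0.shape

rng = np.random.default_rng(2)
n, sigma = 10, 0.05
x_true = logistic_traj(0.3141, n)
y = x_true + sigma * rng.standard_normal(n)    # AWGN observations

# Under Gaussian noise, the ML sequence estimate is the trajectory with
# the smallest sum of squared errors; brute-force the initial condition.
grid = np.linspace(1e-6, 1.0 - 1e-6, 200_001)
sse = ((y[:, None] - logistic_traj(grid, n)) ** 2).sum(axis=0)
print(f"true x0 = 0.3141, ML estimate = {grid[np.argmin(sse)]:.4f}")
```

The exhaustive search illustrates why the exact ML estimator is computationally expensive, and hence why the thesis develops cheaper suboptimal estimators.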
|
136 |
具有額外或不足變異的群集類別資料之研究 / A Study of Modelling Categorical Data with Overdispersion or Underdispersion. 蘇聖珠, Su, Sheng-Chu, Unknown Date (has links)
This thesis presents a modelling method for analyzing categorical data with overdispersion or underdispersion. In many studies, data are collected from different clusters, and members within the same cluster behave similarly. Thus, the responses of members within the same cluster are not independent, and the multinomial distribution is not the correct distribution for the observed counts; the covariance matrix of the sample proportion vector tends to differ substantially from that of the multinomial model. We discuss four models for fitting count data with overdispersion or underdispersion: the Dirichlet-Multinomial model (DM model), the extended DM model (EDM model), and two mean-covariance models. Maximum likelihood estimation is discussed for the DM and EDM models, and procedures to derive generalized least squares estimates are proposed for the two mean-covariance models. As for the cell probabilities, we also discuss how to estimate them under several special structures of two-way and three-way tables. Easily evaluated score test statistics are derived for the DM and EDM models to test the existence of intra-cluster correlation, that is, whether the data exhibit overdispersion or underdispersion. A test of the homogeneity of the intra-cluster correlation across several populations is also derived.
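As a hedged sketch of the DM model's role (the parameterization, toy data, and optimizer are standard choices, not taken from the thesis): the Dirichlet-Multinomial likelihood adds a precision parameter s that inflates the multinomial covariance by the factor implied by the intra-cluster correlation, and its MLE can be computed numerically.

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize

def dm_negloglik(params, X):
    """Negative Dirichlet-Multinomial log-likelihood for the cluster count
    matrix X (one row per cluster); unconstrained params map to Dirichlet
    parameters alpha > 0 via exp.  The multinomial coefficient is omitted
    because it does not depend on alpha."""
    alpha = np.exp(params)
    m = X.sum(axis=1)
    ll = (gammaln(alpha.sum()) - gammaln(alpha.sum() + m)
          + (gammaln(alpha + X) - gammaln(alpha)).sum(axis=1))
    return -ll.sum()

# Toy overdispersed data: 40 clusters of 20 members, 3 categories, with
# cluster-to-cluster variation in the category probabilities.
rng = np.random.default_rng(3)
P = rng.dirichlet([4.0, 2.0, 2.0], size=40)
X = np.array([rng.multinomial(20, p) for p in P])

res = minimize(dm_negloglik, x0=np.zeros(3), args=(X,), method="BFGS")
alpha = np.exp(res.x)
print("cell probabilities:", alpha / alpha.sum())
print("intra-cluster correlation 1/(1+s):", 1.0 / (1.0 + alpha.sum()))
```

When the fitted correlation is near zero the DM model collapses to the multinomial, which is what the score tests mentioned above check without fitting the full model.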
|
137 |
Eliminação de parâmetros perturbadores em um modelo de captura-recaptura / Elimination of nuisance parameters in a capture-recapture model. Salasar, Luis Ernesto Bueno, 18 November 2011 (has links)
The capture-recapture process, largely used in the estimation of the number of elements of an animal population, is also applied in other branches of knowledge such as epidemiology, linguistics, software reliability and ecology. One of the first applications of this method was made by Laplace in 1783, with the aim of estimating the number of inhabitants of France. Later, Carl G. J. Petersen in 1889 and Lincoln in 1930 applied the same estimator in the context of animal populations; it became known in the literature as the Lincoln-Petersen estimator. In the mid-twentieth century several researchers dedicated themselves to the formulation of statistical models appropriate for the estimation of population size, which caused a substantial increase in the amount of theoretical and applied work on the subject. Capture-recapture models are constructed under certain assumptions about the population, the sampling procedure and the experimental conditions. The main assumption that distinguishes models concerns changes in the number of individuals in the population during the experiment. Models that allow for births, deaths or migration are called open population models, while models that do not allow these events are called closed population models. In this work, the goal is to characterize the likelihood functions obtained by applying methods for eliminating nuisance parameters in closed population models. Based on these likelihood functions, we discuss methods for point and interval estimation of the population size. The estimation methods are illustrated on a real data set and their frequentist properties are analysed via Monte Carlo simulation.
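For context, a minimal sketch of the Lincoln-Petersen estimator named in the abstract (the counts below are invented): with n1 animals marked in a first sample, n2 caught in a second sample and m2 of them recaptured, the population estimate is N̂ = n1·n2/m2; the Chapman variant reduces small-sample bias.

```python
def lincoln_petersen(n1, n2, m2):
    """Classic closed-population estimate: N = n1 * n2 / m2."""
    return n1 * n2 / m2

def chapman(n1, n2, m2):
    """Bias-corrected variant, defined even when m2 == 0."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1

# Hypothetical experiment: 200 marked, second sample of 150 with 30 marked.
print(lincoln_petersen(200, 150, 30))  # 1000.0
print(chapman(200, 150, 30))           # ~978.1
```

In the thesis's framing, the capture probabilities are the nuisance parameters; eliminating them (for example by conditioning, integration, or profiling) yields likelihood functions for the population size alone.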
|
138 |
Imputação de dados faltantes via algoritmo EM e rede neural MLP com o método de estimativa de máxima verossimilhança para aumentar a acurácia das estimativas / Imputation of missing data via the EM algorithm and an MLP neural network with maximum likelihood estimation to increase the accuracy of the estimates. Ribeiro, Elisalvo Alves, 14 August 2015 (has links)
Databases with missing values are a frequent occurrence in the real world, and the problem has several causes (failure of the equipment that transmits and stores the data, handler error, failure of the party providing the information, etc.). This can make the data inconsistent and unfit for analysis, leading to heavily biased conclusions. This dissertation explores the use of Multilayer Perceptron Artificial Neural Networks (MLP ANNs) with new activation functions, considering two approaches (single imputation and multiple imputation). First, we propose using the Maximum Likelihood Estimation (MLE) method in the activation function of each neuron of the network, in contrast to the approach currently used, which either does not use this method or uses it only in the cost function (at the network output). We then compare the results of these approaches with the Expectation Maximization (EM) algorithm, the state of the art for treating missing data. The results indicate that using the MLP network with maximum likelihood estimation, whether in all neurons or only in the output function, leads to imputation with lower error. The experimental results, evaluated with metrics such as MAE (Mean Absolute Error) and RMSE (Root Mean Square Error), showed that the best results in most experiments occurred when the MLP approach addressed in this dissertation was used for single and multiple imputation.
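A hedged sketch of single imputation with an MLP (using scikit-learn's stock MLPRegressor rather than the dissertation's modified activation functions, which are not reproduced here; the data are synthetic): train on the complete rows, then predict the missing column for the incomplete rows.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)

# Toy data: column y depends on two observed columns; 20% of y is missing.
X = rng.standard_normal((500, 2))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(500)
missing = rng.random(500) < 0.2

# Single imputation: fit on complete cases, fill in the rest.
mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlp.fit(X[~missing], y[~missing])
y_imputed = y.copy()
y_imputed[missing] = mlp.predict(X[missing])

rmse = np.sqrt(np.mean((y_imputed[missing] - y[missing]) ** 2))
print(f"RMSE on the imputed entries: {rmse:.3f}")
```

Multiple imputation would repeat this with perturbed fits (or added residual noise) to produce several completed data sets, propagating the imputation uncertainty into later analyses.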
|
139 |
Observation error model selection by information criteria vs. normality testing. Lehmann, Rüdiger, 17 October 2016 (has links) (PDF)
To extract the best possible information from geodetic and geophysical observations, it is necessary to select a model of the observation errors, mostly from the family of Gaussian normal distributions. However, there are alternatives, typically chosen in the framework of robust M-estimation. We give a synopsis of well-known and less well-known models for observation errors and propose to select a model based on information criteria. In this contribution we compare the Akaike information criterion (AIC) and the Anderson-Darling (AD) test and apply them to the test problem of fitting a straight line. The comparison is facilitated by a Monte Carlo approach. It turns out that model selection by AIC has some advantages over the AD test.
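A hedged sketch of the kind of comparison described (the simulated design is invented; the paper's Monte Carlo setup may differ): fit a straight line by least squares, then score Gaussian and Laplace observation-error models on the residuals by AIC, each with its scale parameter at the maximum likelihood value.

```python
import numpy as np

rng = np.random.default_rng(5)

# Straight-line data with heavy-tailed (Laplace) observation errors.
x = np.linspace(0.0, 1.0, 200)
y = 2.0 + 3.0 * x + rng.laplace(0.0, 0.1, size=x.size)

A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
r = y - A @ coef                      # residuals of the line fit

n, k = r.size, 3                      # parameters: intercept, slope, scale

sigma = np.sqrt(np.mean(r**2))        # Gaussian scale MLE
ll_gauss = -0.5 * n * (np.log(2.0 * np.pi * sigma**2) + 1.0)
b = np.mean(np.abs(r))                # Laplace scale MLE
ll_laplace = -n * (np.log(2.0 * b) + 1.0)

for name, ll in [("Gaussian", ll_gauss), ("Laplace", ll_laplace)]:
    print(f"{name:8s} AIC = {2 * k - 2 * ll:.1f}")   # smaller is better
```

Whereas the AD test only accepts or rejects normality, the AIC ranks all candidate error models on a common scale, which hints at why an information criterion can be the more informative tool.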
|
140 |
Statistická analýza souborů s malým rozsahem / Statistical Analysis of Sample with Small Size. Holčák, Lukáš, January 2008 (has links)
This diploma thesis focuses on the analysis of small samples where it is not possible to obtain more data, typically because collecting it would be too capital- or time-intensive, or because the production process or the available financial resources do not allow more data to be gathered. Of course, the analysis of small samples is very uncertain, because the inferences are always encumbered with a considerable level of uncertainty.
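As a hedged illustration of that uncertainty (all values invented): a Student-t confidence interval for a mean widens sharply at small n, both through the t quantile and through the standard error.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
population = rng.normal(10.0, 2.0, size=10_000)

for n in (5, 10, 30, 100):
    sample = rng.choice(population, size=n, replace=False)
    m, se = sample.mean(), sample.std(ddof=1) / np.sqrt(n)
    t = stats.t.ppf(0.975, df=n - 1)
    print(f"n={n:3d}: mean={m:5.2f}, 95% CI half-width={t * se:.2f}")
```

With n = 5 the half-width is several times larger than with n = 100, which quantifies the thesis's point that small-sample inferences carry a substantial level of uncertainty.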
|