21 |
Uso dos métodos clássico e bayesiano para os modelos não-lineares heterocedásticos simétricos / Use of the classical and Bayesian methods for nonlinear heteroscedastic symmetric models
Márcia Aparecida Centanin Macêra, 21 June 2011
Normal regression models have been used for many years in data analysis. Even when normality could not be assumed, some kind of transformation was attempted in order to achieve it. In practice, however, the assumptions of normality and linearity are not always satisfied. New classes of regression models were developed as alternatives to the classical technique. In this context, we focus on the class of models in which the distribution assumed for the response variable belongs to the class of symmetric distributions. The general aim of this work is to model this class in the Bayesian context, in particular the class of nonlinear heteroscedastic symmetric models. This work is connected with two lines of research: statistical inference addressing aspects of asymptotic theory, and Bayesian inference considering aspects of modeling and model selection criteria based on Markov chain Monte Carlo (MCMC) simulation methods. A first step presents the class of nonlinear heteroscedastic symmetric models together with classical inference for their parameters. We then propose a Bayesian approach to these models, with the aim of showing its feasibility and comparing Bayesian inference for the parameters, estimated via MCMC methods, with classical inference based on estimates obtained with the GAMLSS tool. In addition, we use the Bayesian case-by-case influence analysis method based on the Kullback-Leibler divergence to detect influential observations in the data. The computational implementation was developed in R; details of the programs can be obtained from the authors.
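The case-by-case influence measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes conditionally independent observations, so that the divergence between the full posterior and the posterior with case i deleted reduces to K_i = log E[1/f(y_i | θ)] + E[log f(y_i | θ)], with both expectations taken over the full posterior. The function name `kl_case_influence` and the input layout (one row per MCMC draw, one column per observation) are illustrative assumptions, not the thesis's R implementation.

```python
import numpy as np


def kl_case_influence(loglik):
    """Estimate the KL divergence between the full posterior and each
    case-deleted posterior from posterior draws.

    loglik: array of shape (n_draws, n_obs) holding log f(y_i | theta_s)
            for every posterior draw s and observation i (assumed layout).
    Returns an array of length n_obs; larger values flag more influential cases.
    """
    loglik = np.asarray(loglik, dtype=float)
    n_draws = loglik.shape[0]

    # log of the posterior mean of 1 / f(y_i | theta), computed stably
    neg = -loglik
    shift = neg.max(axis=0)
    log_mean_inverse = shift + np.log(np.exp(neg - shift).sum(axis=0)) - np.log(n_draws)

    # K_i = log E[1 / f(y_i | theta)] + E[log f(y_i | theta)]
    return log_mean_inverse + loglik.mean(axis=0)
```

By Jensen's inequality each estimate is non-negative, and an observation that the fitted model explains poorly yields a comparatively large value. Any cutoff for declaring a case influential would be a further modeling choice.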
|
22 |
Information Theoretical Measures for Achieving Robust Learning Machines
Zegers, Pablo; Frieden, B.; Alarcón, Carlos; Fuentes, Alexis, 12 August 2016
Information theoretical measures are used to design, from first principles, an objective function that can drive a learning machine process to a solution that is robust to perturbations in parameters. Full analytic derivations are given and tested with computational examples showing that indeed the procedure is successful. The final solution, implemented by a robust learning machine, expresses a balance between Shannon differential entropy and Fisher information. This is also surprising in being an analytical relation, given the purely numerical operations of the learning machine.
|
23 |
Model selection criteria in the presence of missing data based on the Kullback-Leibler discrepancy
Sparks, JonDavid, 01 December 2009
An important challenge in statistical modeling involves determining an appropriate structural form for a model to be used in making inferences and predictions. Missing data is a very common occurrence in most research settings and can easily complicate the model selection problem. Many useful procedures have been developed to estimate parameters and standard errors in the presence of missing data; however, few methods exist for determining the actual structural form of a model when the data is incomplete.
In this dissertation, we propose model selection criteria based on the Kullback-Leibler discrepancy that can be used in the presence of missing data. The criteria are developed by accounting for missing data using principles related to the expectation maximization (EM) algorithm and bootstrap methods. We formulate the criteria for three specific modeling frameworks: the normal multivariate linear regression model, a generalized linear model, and a normal longitudinal regression model. In each framework, a simulation study is presented to investigate the performance of the criteria relative to their traditional counterparts. We consider a setting where the missingness is confined to the outcome, and also a setting where the missingness may occur in the outcome and/or the covariates. The results from the simulation studies indicate that our criteria provide better protection against underfitting than their traditional analogues.
We outline the implementation of our methodology for a general discrepancy measure. An application is presented where the proposed criteria are utilized in a study that evaluates the driving performance of individuals with Parkinson's disease under low contrast (fog) conditions in a driving simulator.
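For orientation, the best-known traditional criterion derived from the Kullback-Leibler discrepancy is the Akaike information criterion, AIC = -2 log L + 2k. The sketch below computes it for a complete-data, univariate normal linear regression. It does not implement the missing-data criteria proposed in the dissertation, and the function name `gaussian_aic` is an illustrative assumption.

```python
import numpy as np


def gaussian_aic(X, y):
    """AIC for a normal linear regression fitted by maximum likelihood.

    X: (n, p) design matrix (include a column of ones for an intercept).
    y: (n,) response vector with no missing values (complete-data case only).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares = ML estimate
    resid = y - X @ beta
    sigma2 = resid @ resid / n                      # ML estimate of the error variance

    # Maximized Gaussian log-likelihood
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)

    k = p + 1                                       # regression coefficients + variance
    return -2.0 * loglik + 2.0 * k
```

A lower value indicates the preferred model. An underfitted model (one that omits a covariate with a real effect) typically shows a much larger AIC than the correctly specified one.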
|
24 |
Effective Authorship Attribution in Large Document Collections
Zhao, Ying, ying.zhao@rmit.edu.au, January 2008
Techniques that can effectively identify authors of texts are of great importance in scenarios such as detecting plagiarism and identifying a source of information. A range of attribution approaches has been proposed in recent years, but none of these is particularly satisfactory; some of them are ad hoc and most have defects in terms of scalability, effectiveness, and computational cost. Good test collections are critical for evaluation of authorship attribution (AA) techniques. However, there are no standard benchmarks available in this area; it is almost always the case that researchers have their own test collections. Furthermore, collections that have been explored in AA are usually small, and thus whether the existing approaches are reliable or scalable is unclear. We develop several AA collections that are substantially larger than those in the literature; machine learning methods are used to establish the value of using such corpora in AA. The results, also used as baseline results in this thesis, show that the developed text collections can be used as standard benchmarks, and are able to clearly distinguish between different approaches. One of the major contributions is that we propose the use of the Kullback-Leibler divergence, a measure of how different two distributions are, to identify authors based on elements of writing style. The results show that our approach is at least as effective as, if not always better than, the best existing attribution methods (that is, support vector machines) for two-class AA, and is superior for multi-class AA. Moreover, our proposed method has much lower computational cost and is cheaper to train. Style markers are the key elements of style analysis. We explore several approaches to tokenising documents to extract style markers, examining which marker type works best.
We also propose three systems that boost AA performance by combining evidence from various marker types, motivated by the observation that no single type of marker can satisfy all AA scenarios. To address the scalability of AA, we propose the novel task of authorship search (AS), inspired by document search and intended for large document collections. Our results show that AS is reasonably effective at finding documents by a particular author, even within a collection consisting of half a million documents. Beyond search, we also propose an AS-based method to identify authorship. Our method is substantially more scalable than any method published in prior AA research, in terms of the collection size and the number of candidate authors; the discrimination is scaled up to several hundred authors.
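The idea of attributing a document to the author whose style distribution is closest in Kullback-Leibler divergence can be sketched as follows. This is a toy illustration under stated assumptions: whitespace-separated lowercase words stand in for style markers, and add-one smoothing keeps every probability positive. The thesis's actual markers, smoothing, and scoring are not given in the abstract, and the names `smoothed_dist`, `kl_divergence`, and `attribute` are hypothetical.

```python
import math
from collections import Counter


def smoothed_dist(tokens, vocab, alpha=1.0):
    """Additively smoothed token distribution over a fixed vocabulary (assumed scheme)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def attribute(document, author_texts):
    """Return the author whose distribution has the smallest KL divergence
    from the document's distribution, together with all scores."""
    doc_tokens = document.lower().split()
    author_tokens = {a: t.lower().split() for a, t in author_texts.items()}

    # Shared vocabulary so that every distribution has the same support
    vocab = set(doc_tokens)
    for tokens in author_tokens.values():
        vocab.update(tokens)

    p = smoothed_dist(doc_tokens, vocab)
    scores = {a: kl_divergence(p, smoothed_dist(tokens, vocab))
              for a, tokens in author_tokens.items()}
    return min(scores, key=scores.get), scores
```

Scoring needs only one pass over each author's counts and no iterative training, which is consistent with the low computational cost claimed in the abstract.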
|
25 |
Algorithms and performance optimization for distributed radar automatic target recognition
Wilcher, John S., 08 June 2015
This thesis focuses upon automatic target recognition (ATR) with radar sensors. Recent advancements in ATR have included the processing of target signatures from multiple, spatially-diverse perspectives. The advantage of multiple perspectives in target classification results from the angular sensitivity of reflected radar transmissions. By viewing the target at different angles, the classifier has a better opportunity to distinguish between target classes. This dissertation extends recent advances in multi-perspective target classification by: 1) leveraging bistatic target reflectivity signatures observed from multiple, spatially-diverse radar sensors; and, 2) employing a statistical distance measure to identify radar sensor locations yielding improved classification rates.
The algorithms provided in this thesis use high resolution range (HRR) profiles, formed by each participating radar sensor, as input to a multi-sensor classification algorithm derived using the fundamentals of statistical signal processing. Improvements to target classification rates are demonstrated for multiple configurations of transmitter, receiver, and target locations. These improvements are shown to emanate from the multi-static characteristics of a target class's range profile and not merely from non-coherent gain. The significance of dominant scatterer reflections is revealed in both classification performance and the “statistical distance” between target classes. Numerous simulations have been performed to interrogate the robustness of the derived classifier. Errors in target pose angle and the inclusion of camouflage, concealment, and deception (CCD) effects are considered in assessing the validity of the classifier. Different transmitter and receiver combinations and low signal-to-noise ratios are analyzed in the context of deterministic, Gaussian, and uniform target pose uncertainty models. Performance metrics demonstrate increases in classification rates of up to 30% for multiple-transmit, multiple-receive platform configurations when compared to multi-sensor monostatic configurations.
A distance measure between probable target classes is derived using information theoretic techniques pioneered by Kullback and Leibler. The derived measure is shown to suggest radar sensor placements yielding better target classification rates. The predicted placements consider two-platform and three-platform configurations in a single-transmit, multiple-receive environment. Significant improvements in classification rates are observed when compared to ad hoc sensor placement. In one study, platform placements identified by the distance measure algorithm are shown to produce classification rates exceeding those of 98.8% of all possible platform placements.
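The abstract does not state the exact form of the derived distance measure. As a hedged illustration of a Kullback-Leibler "statistical distance" between two classes, the sketch below evaluates the standard closed form for two multivariate Gaussians. Modeling each target class as Gaussian is an assumption made here for the example, as is the function name `kl_gaussians`.

```python
import numpy as np


def kl_gaussians(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for multivariate Gaussians.

    Closed form:
      0.5 * [ tr(cov1^-1 cov0) + (mu1-mu0)^T cov1^-1 (mu1-mu0) - k + ln(det cov1 / det cov0) ]
    """
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    cov0 = np.atleast_2d(np.asarray(cov0, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))

    k = mu0.shape[0]
    diff = mu1 - mu0

    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    mahalanobis = diff @ np.linalg.solve(cov1, diff)
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)

    return 0.5 * (trace_term + mahalanobis - k + logdet1 - logdet0)
```

Under this toy model, a sensor placement could be scored by how large the divergence between the class-conditional distributions is at that placement, with larger values suggesting classes that are easier to tell apart. Note that the divergence is not symmetric in its two arguments.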
|
26 |
Delay estimation in computer networks
Johnson, Nicholas Alexander, January 2010
Computer networks are becoming increasingly large and complex; more so with the recent penetration of the internet into all walks of life. It is essential to be able to monitor and to analyse networks in a timely and efficient manner; to extract important metrics and measurements and to do so in a way which does not unduly disturb or affect the performance of the network under test. Network tomography is one possible method to accomplish these aims. Drawing upon the principles of statistical inference, it is often possible to determine the statistical properties of either the links or the paths of the network, whichever is desired, by measuring at the most convenient points thus reducing the effort required. In particular, bottleneck-link detection methods in which estimates of the delay distributions on network links are inferred from measurements made at end-points on network paths, are examined as a means to determine which links of the network are experiencing the highest delay. Initially two published methods, one based upon a single Gaussian distribution and the other based upon the method-of-moments, are examined by comparing their performance using three metrics: robustness to scaling, bottleneck detection accuracy and computational complexity. Whilst there are many published algorithms, there is little literature in which said algorithms are objectively compared. In this thesis, two network topologies are considered, each with three configurations in order to determine performance in six scenarios. Two new estimation methods are then introduced, both based on Gaussian mixture models which are believed to offer an advantage over existing methods in certain scenarios. Computationally, a mixture model algorithm is much more complex than a simple parametric algorithm but the flexibility in modelling an arbitrary distribution is vastly increased. Better model accuracy potentially leads to more accurate estimation and detection of the bottleneck. 
The concept of increasing flexibility is again considered by using a Pearson type-1 distribution as an alternative to the single Gaussian distribution. This increases the flexibility but with a reduced complexity when compared with mixture model approaches, which necessitate the use of iterative approximation methods. A hybrid approach is also considered where the method-of-moments is combined with the Pearson type-1 method in order to circumvent problems with the output stage of the former. This algorithm has a higher variance than the method-of-moments but the output stage is more convenient for manipulation. Also considered is a new approach to detection algorithms which is not dependent on any a priori parameter selection and makes use of the Kullback-Leibler divergence. The results show that it accomplishes its aim but is not robust enough to replace the current algorithms. Delay estimation is then cast in a different role, as an integral part of an algorithm to correlate input and output streams in an anonymising network such as the onion router (TOR). TOR is used in an attempt to conceal network traffic from observation. Breaking the encryption protocols used is not possible without significant effort, but by correlating the unencrypted input and output streams from the TOR network, it is possible to provide a degree of certainty about the ownership of traffic streams. The delay model is essential as the network is treated as providing a pseudo-random delay to each packet; having an accurate model allows the algorithm to better correlate the streams.
|
27 |
Diagnosability performance analysis of models and fault detectors
Jung, Daniel, January 2015
Model-based diagnosis compares observations from a system with predictions from a mathematical model to detect and isolate faulty components. Analyzing which faults can be detected and isolated given the model gives useful information when designing a diagnosis system. This information can be used, for example, to determine which residual generators can be generated or to select a sufficient set of sensors that can be used to detect and isolate the faults. With more information about the system taken into consideration during such an analysis, more accurate estimates can be computed of the fault detectability and isolability that can be achieved. Model uncertainties and measurement noise are the main reasons for reduced fault detection and isolation performance and can make it difficult to design a diagnosis system that fulfills given performance requirements. By taking information about different uncertainties into consideration early in the development process of a diagnosis system, it is possible to predict what performance a diagnosis system can achieve and to avoid bad design choices. This thesis deals with quantitative analysis of fault detectability and isolability performance when taking model uncertainties and measurement noise into consideration. The goal is to analyze fault detectability and isolability performance given a mathematical model of the monitored system before a diagnosis system is developed. A quantitative measure of fault detectability and isolability performance for a given model, called distinguishability, is proposed based on the Kullback-Leibler divergence. The distinguishability measure answers questions like "How difficult is it to isolate a fault f_i from another fault f_j?". Different properties of the distinguishability measure are analyzed.
It is shown, for example, that for linear descriptor models with Gaussian noise, distinguishability gives an upper limit for the fault-to-noise ratio of any linear residual generator. The proposed measure is used for quantitative analysis of a nonlinear mean value model of gas flows in a heavy-duty diesel engine to analyze how fault diagnosability performance varies for different operating points. It is also used to formulate the sensor selection problem, i.e., to find the cheapest set of available sensors that should be used in a system to achieve the required fault diagnosability performance. As a case study, quantitative fault diagnosability analysis is used during the design of an engine misfire detection algorithm based on the crankshaft angular velocity measured at the flywheel. Decisions during the development of the misfire detection algorithm are motivated using quantitative analysis of the misfire detectability performance showing, for example, varying detection performance at different operating points and for different cylinders to identify when it is more difficult to detect misfires. This thesis presents a framework for quantitative fault detectability and isolability analysis that is a useful tool during the design of a diagnosis system. The different applications show examples of how quantitative analysis can be applied during a design process, either as feedback to an engineer or when formulating different design steps as optimization problems to ensure that the required performance can be achieved.
|
28 |
Collective reasoning under uncertainty and inconsistency
Adamcik, Martin, January 2014
In this thesis we investigate some global desiderata for probabilistic knowledge merging given several possibly jointly inconsistent, but individually consistent knowledge bases. We show that the most naive methods of merging, which combine applications of a single expert inference process with the application of a pooling operator, fail to satisfy certain basic consistency principles. We therefore adopt a different approach. Following recent developments in machine learning where Bregman divergences appear to be powerful, we define several probabilistic merging operators which minimise the joint divergence between merged knowledge and given knowledge bases. In particular we prove that in many cases the result of applying such operators coincides with the sets of fixed points of averaging projective procedures, that is, procedures which combine knowledge updating with pooling operators of decision theory. We develop relevant results concerning the geometry of Bregman divergences and prove new theorems in this field. We show that this geometry connects nicely with some desirable principles which have arisen in the epistemology of merging. In particular, we prove that the merging operators which we define by means of convex Bregman divergences satisfy analogues of the principles of merging due to Konieczny and Pino-Perez. Additionally, we investigate how such merging operators behave with respect to principles concerning irrelevant information, independence and relativisation which have previously been intensively studied in the case of single-expert probabilistic inference. Finally, we argue that two particular probabilistic merging operators which are based on Kullback-Leibler divergence, a special type of Bregman divergence, have overall the most appealing properties amongst merging operators hitherto considered. By investigating some iterative procedures we propose algorithms to practically compute them.
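The statement that the Kullback-Leibler divergence is a special type of Bregman divergence can be checked with a short derivation. The notation below (generator F, probability vectors p and q) is chosen for this note and is not taken from the thesis. A Bregman divergence is defined from a strictly convex, differentiable generator F, and choosing F as negative Shannon entropy gives:

```latex
\begin{aligned}
D_F(p, q) &= F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle,
  \qquad F(p) = \sum_i p_i \log p_i, \quad \frac{\partial F}{\partial q_i} = \log q_i + 1, \\
D_F(p, q) &= \sum_i p_i \log p_i - \sum_i q_i \log q_i - \sum_i (p_i - q_i)(\log q_i + 1) \\
          &= \sum_i p_i \log \frac{p_i}{q_i} - \sum_i p_i + \sum_i q_i \\
          &= \sum_i p_i \log \frac{p_i}{q_i} = \mathrm{KL}(p \,\|\, q)
  \qquad \text{when } \sum_i p_i = \sum_i q_i = 1 .
\end{aligned}
```

The last step uses only the fact that both arguments are probability vectors, so the two linear terms cancel.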
|
29 |
ASSOCIATION OF TOO SHORT ARCS USING ADMISSIBLE REGION
Surabhi Bhadauria (8695017), 24 April 2020
The near-Earth space is filled with over 300,000 artificial debris objects with a diameter larger than one cm. For objects in the GEO and MEO regions, observations are made mainly through optical sensors. These sensors take observations over a short time span that covers only a negligible part of the object's orbit. Two or more such observations are treated as a single Too Short Arc (TSA). Each TSA from an optical sensor consists of several angles: the right ascension and declination, along with their rates of change. However, because a single TSA covers only a very small fraction of the orbit, its observational data are not sufficient for a complete initial determination of the object's orbit. For a newly detected unknown object, only TSAs are available, with no information about the object's orbit. Therefore, two or more TSAs that belong to the same object are required for its orbit determination. To solve this correlation problem, the framework of the probabilistic Admissible Region is used, which restricts the possible orbits based on a single TSA. To propagate the Admissible Region to the time of a second TSA, it is represented in closed form as a Gaussian mixture. This way, propagation with an Extended Kalman filter is possible. To decide whether two TSAs are correlated, that is, whether they belong to the same object, an overlap between the regions is found in a suitable orbital-mechanics-based coordinate frame. To compute the overlap, the Kullback-Leibler divergence is used as the information measure.
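The Kullback-Leibler divergence between two Gaussian mixtures has no general closed form, so one common way to evaluate it is a Monte Carlo estimate. The sketch below does this for one-dimensional mixtures. It is a toy illustration only: the abstract does not say how the thesis computes the divergence, the real Admissible Regions live in a higher-dimensional orbital coordinate frame, and the helper names `gmm_logpdf`, `gmm_sample`, and `mc_kl` are hypothetical.

```python
import numpy as np


def gmm_logpdf(x, weights, means, stds):
    """Log-density of a 1-D Gaussian mixture evaluated at each point of x."""
    x = np.asarray(x, dtype=float)[:, None]
    w = np.asarray(weights, dtype=float)[None, :]
    m = np.asarray(means, dtype=float)[None, :]
    s = np.asarray(stds, dtype=float)[None, :]

    # Per-component log(weight * normal pdf), combined with a stable log-sum-exp
    log_comp = np.log(w) - 0.5 * np.log(2.0 * np.pi) - np.log(s) - 0.5 * ((x - m) / s) ** 2
    shift = log_comp.max(axis=1, keepdims=True)
    return (shift + np.log(np.exp(log_comp - shift).sum(axis=1, keepdims=True)))[:, 0]


def gmm_sample(rng, n, weights, means, stds):
    """Draw n samples: pick a component by weight, then sample from it."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(np.asarray(means, dtype=float)[comp],
                      np.asarray(stds, dtype=float)[comp])


def mc_kl(p, q, n=20000, seed=0):
    """Monte Carlo estimate of KL(p || q); p and q are (weights, means, stds) tuples."""
    rng = np.random.default_rng(seed)
    x = gmm_sample(rng, n, *p)
    return float(np.mean(gmm_logpdf(x, *p) - gmm_logpdf(x, *q)))
```

A small estimate indicates strongly overlapping regions, which in the setting of the abstract would support the hypothesis that two TSAs belong to the same object. The decision threshold is a separate design choice not covered here.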
|
30 |
Quantifying Model Error in Bayesian Parameter Estimation
White, Staci A., 08 October 2015
No description available.
|