Global ETD Search

1	Dispersijos minimizavimas renkant nelygių tikimybių imtį / On variance minimization for unequal probability sampling Čiginas, Andrius 02 July 2014 We consider sampling designs in which the inclusion probabilities are mixtures of two components. The first component is proportional to the size of a population unit, as described by available auxiliary information. The second component is the same for every unit. We look for mixtures that minimize the variances of various estimators of the population total and show how auxiliary information can help to locate such mixtures approximately. We report theoretical and simulation results for Poisson samples drawn from populations generated by a linear regression model. Poisson mixture sampling Variance minimization
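The mixture design described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's code: the parameterisation `pi_i = n * (lam * x_i / sum(x) + (1 - lam) / N)`, the cap at 1, the grid search, and all function names are hypothetical. The variance formula used is the standard one for the Horvitz-Thompson estimator under Poisson sampling.

```python
import random

def mixture_inclusion_probs(sizes, n, lam):
    """Assumed form: pi_i = n * (lam * x_i / sum(x) + (1 - lam) / N), capped at 1."""
    total = sum(sizes)
    N = len(sizes)
    return [min(1.0, n * (lam * x / total + (1.0 - lam) / N)) for x in sizes]

def ht_variance_poisson(y, probs):
    """Variance of the Horvitz-Thompson total under Poisson sampling:
    sum over units of (1 / pi_i - 1) * y_i ** 2."""
    return sum((1.0 / p - 1.0) * v * v for v, p in zip(y, probs))

def poisson_sample(probs, rng):
    """Include each unit independently with its own probability."""
    return [i for i, p in enumerate(probs) if rng.random() < p]

def ht_estimate(y, probs, sample):
    """Horvitz-Thompson estimate of the population total."""
    return sum(y[i] / probs[i] for i in sample)

def best_mixture(y, sizes, n, grid=101):
    """Grid search over lam in [0, 1] for the smallest HT variance (hypothetical helper)."""
    lams = [k / (grid - 1) for k in range(grid)]
    return min(
        lams,
        key=lambda lam: ht_variance_poisson(y, mixture_inclusion_probs(sizes, n, lam)),
    )
```

In practice the study variable `y` is unknown before sampling, which is why the abstract speaks of using auxiliary information to locate the optimal mixture approximately; the grid search above uses `y` directly only to show what is being minimized.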
2	Hot Spot Identification and Analysis Methodology Farnsworth, Jacob S. 20 November 2013 The Utah Department of Transportation (UDOT) Traffic and Safety Division continues to advance the safety of roadway sections throughout the state. To aid UDOT in meeting their goal, the Department of Civil and Environmental Engineering at Brigham Young University (BYU) has worked with the Statistics Department in developing analysis tools for safety. The most recent of these tools has been the development of a hierarchical Bayesian Poisson Mixture Model (PMM) of traffic crashes and safety on UDOT roadways statewide and the integration of the results of this model into a Geographic Information System (GIS) framework. This research focuses on the enhancement of the framework for highway safety mitigation in Utah with its six primary steps: 1) network screening, 2) diagnosis, 3) countermeasure selection, 4) economic appraisal, 5) project prioritization, and 6) effectiveness evaluation. The framework was enhanced by developing a methodology for accomplishing the steps of network screening, diagnosis, and countermeasure selection. This methodology is titled "Hot Spot Identification and Analysis." The hot spot identification and analysis methodology consists of the following seven steps: 1) identify problematic segments with safety concerns, 2) identify problem spots within the segments, 3) perform micro analysis of problematic segments and spots, 4) define the segment, 5) define the problem, 6) evaluate possible countermeasures, and 7) select and recommend feasible countermeasures. The methodology is intended to help identify hot spots with safety concerns so that they can be analyzed and countermeasures can be identified to mitigate the safety issues. Examples of how the methodology is to function are drawn from Utah's state roadway network.
Poisson Mixture Model hot spots countermeasures safety Civil and Environmental Engineering
3	A Gamma-Poisson topic model for short text Mazarura, Jocelyn Rangarirai January 2020 Most topic models are constructed under the assumption that documents follow a multinomial distribution. The Poisson distribution is an alternative distribution to describe the probability of count data. For topic modelling, the Poisson distribution describes the number of occurrences of a word in documents of fixed length. The Poisson distribution has been successfully applied in text classification, but its application to topic modelling is not well documented, specifically in the context of a generative probabilistic model. Furthermore, the few Poisson topic models in the literature are admixture models, making the assumption that a document is generated from a mixture of topics. In this study, we focus on short text. Many studies have shown that the simpler assumption of a mixture model fits short text better. With mixture models, as opposed to admixture models, the generative assumption is that a document is generated from a single topic. One topic model that makes this one-topic-per-document assumption is the Dirichlet-multinomial mixture model. The main contributions of this work are a new Gamma-Poisson mixture (GPM) model, as well as a collapsed Gibbs sampler for the model. The benefit of the collapsed Gibbs sampler derivation is that the model is able to automatically select the number of topics contained in the corpus. The results show that the Gamma-Poisson mixture model performs better than the Dirichlet-multinomial mixture model at selecting the number of topics in labelled corpora. Furthermore, the Gamma-Poisson mixture produces better topic coherence scores than the Dirichlet-multinomial mixture model, thus making it a viable option for the challenging task of topic modelling of short text. The application of GPM was then extended to a further real-world task: that of distinguishing between semantically similar and dissimilar texts.
The objective was to determine whether GPM could produce semantic representations that allow the user to determine the relevance of new, unseen documents to a corpus of interest. The challenge of addressing this problem in short text from small corpora was of key interest. Corpora of small size are not uncommon. For example, at the start of the Coronavirus pandemic limited research was available on the topic. Handling short text is challenging not only because of the sparsity of such text, but also because some corpora, such as chats between people, tend to be noisy. The performance of GPM was compared to that of word2vec under these challenging conditions on labelled corpora. It was found that GPM was able to produce better results based on accuracy, precision and recall in most cases. In addition, unlike word2vec, GPM was shown to be applicable to unlabelled datasets, and a methodology for this was also presented. Finally, a relevance index metric was introduced. This relevance index translates the similarity distance between a corpus of interest and a test document into the probability that the test document is semantically similar to the corpus of interest. / Thesis (PhD (Mathematical Statistics))--University of Pretoria, 2020. / Statistics / PhD (Mathematical Statistics) / Unrestricted Topic modelling for short text Gamma-poisson mixture mixture models topic modelling document similarity
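As background to the name "Gamma-Poisson", the sketch below computes the marginal probability of a count whose Poisson rate has a gamma prior, which is the negative binomial distribution. It is only the basic building block, assuming a shape/rate parameterisation of the gamma prior; it is not the thesis's topic model or its collapsed Gibbs sampler, and the function name is hypothetical.

```python
import math

def gamma_poisson_pmf(count, shape, rate):
    """P(count) when count ~ Poisson(lam) and lam ~ Gamma(shape, rate).

    Integrating lam out gives the negative binomial pmf:
    Gamma(count + shape) / (Gamma(shape) * count!)
        * (rate / (rate + 1)) ** shape * (1 / (rate + 1)) ** count
    """
    log_p = (
        math.lgamma(count + shape)
        - math.lgamma(shape)
        - math.lgamma(count + 1)
        + shape * math.log(rate / (rate + 1.0))
        - count * math.log(rate + 1.0)
    )
    return math.exp(log_p)
```

Working on the log scale with `math.lgamma` avoids overflow for large counts. The marginal mean is `shape / rate`, the same as the mean of the gamma prior.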
4	Use of Roadway Attributes in Hot Spot Identification and Analysis Bassett, David R. 01 July 2015 The Utah Department of Transportation (UDOT) Traffic and Safety Division continues to advance the safety of roadway sections throughout the state. In an effort to aid UDOT in meeting their goal, the Department of Civil and Environmental Engineering at Brigham Young University (BYU) has worked with the Statistics Department in developing analysis tools for safety. The most recent of these tools has been the development of a hierarchical Bayesian Poisson Mixture Model (PMM) of traffic crashes known as the Utah Crash Prediction Model (UCPM), a hierarchical Bayesian Binomial statistical model known as the Utah Crash Severity Model (UCSM), and a Bayesian Horseshoe selection method. The UCPM and UCSM helped with the analysis of safety on UDOT roadways statewide, and the results of these models were integrated into a Geographic Information System (GIS) framework. This research focuses on the addition of roadway attributes in the selection and analysis of “hot spots.” This is in conjunction with the framework for highway safety mitigation in Utah with its six primary steps: network screening, diagnosis, countermeasure selection, economic appraisal, project prioritization, and effectiveness evaluation. The addition of roadway attributes was included as part of the network screening, diagnosis, and countermeasure selection, which are included in the methodology titled “Hot Spot Identification and Analysis.” Included in this research was the documentation of the steps and process for data preparation and model use for the step of network screening and the creation of one of the report forms for the steps of diagnosis and countermeasure selection. The addition of roadway attributes is required at numerous points in the process. Methods were developed to locate and evaluate the usefulness of available data.
Procedures and systemization were created to convert raw data into new roadway attributes, such as grade and sag/crest curve location. For the roadway attributes to be useful in selection and analysis, methods were developed to combine and associate the attributes to crashes on problem segments and problem spots. The methodology for “Hot Spot Identification and Analysis” was enhanced to include steps for the inclusion and defining of the roadway attributes. These methods and procedures were used to help in the identification of safety hot spots so that they can be analyzed and countermeasures selected. Examples of how the methods are to function are given with sites from Utah’s state roadway network. Poisson Mixture Model crash analysis hot spots safety roadway attributes Civil and Environmental Engineering
5	Nonparametric estimation of the mixing distribution in mixed models with random intercepts and slopes Saab, Rabih 24 April 2013 Generalized linear mixed models (GLMMs) are widely used in statistical applications to model count and binary data. We consider the problem of nonparametric likelihood estimation of mixing distributions in GLMMs with multiple random effects. The log-likelihood to be maximized has the general form $l(G) = \sum_i \log \int f(y_i, \gamma)\, dG(\gamma)$, where $f(\cdot, \gamma)$ is a parametric family of component densities, $y_i$ is the $i$th observed response variable, and $G$ is a mixing distribution function of the random effects vector $\gamma$ defined on $\Omega$. The literature presents many algorithms for maximum likelihood estimation (MLE) of $G$ in the univariate random effect case, such as the EM algorithm (Laird, 1978), the intra-simplex direction method, ISDM (Lesperance and Kalbfleisch, 1992), and the vertex exchange method, VEM (Böhning, 1985). In this dissertation, the constrained Newton method (CNM) in Wang (2007), which fits GLMMs with random intercepts only, is extended to fit clustered datasets with multiple random effects. Owing to the general equivalence theorem from the geometry of mixture likelihoods (see Lindsay, 1995), many NPMLE algorithms, including CNM and ISDM, maximize the directional derivative of the log-likelihood to add potential support points to the mixing distribution $G$. Our method, Direct Search Directional Derivative (DSDD), uses a directional search method to find local maxima of the multi-dimensional directional derivative function. The DSDD's performance is investigated in GLMMs where $f$ is a Bernoulli or Poisson distribution function. The algorithm is also extended to cover GLMMs with zero-inflated data. Goodness-of-fit (GOF) and selection methods for mixed models have been developed in the literature; however, their application in models with nonparametric random effects distributions is vague and ad hoc.
Some popular measures such as the Deviance Information Criterion (DIC), the conditional Akaike Information Criterion (cAIC), and R² statistics are potentially useful in this context. Additionally, some cross-validation goodness-of-fit methods popular in Bayesian applications, such as the conditional predictive ordinate (CPO) and numerical posterior predictive checks, can be applied with some minor modifications to suit the non-Bayesian approach. / Graduate / 0463 / rabihsaab@gmail.com geometry of mixture likelihood Poisson mixture model Bernoulli mixture model Direct Search directional derivative Zero-inflated data goodness-of-fit model selection
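The log-likelihood and directional derivative named in this abstract can be illustrated for the simplest case. The sketch below is a simplification under stated assumptions: a single scalar Poisson mean in place of the random effects vector, and a discrete mixing distribution given as support points with weights. The function names are hypothetical and this is not the DSDD algorithm itself. The directional derivative is the standard gradient function from the mixture-likelihood literature, the sum over observations of the component density divided by the mixture density, minus the sample size.

```python
import math

def poisson_pmf(y, rate):
    """Poisson probability of count y for a strictly positive rate."""
    return math.exp(-rate + y * math.log(rate) - math.lgamma(y + 1))

def mixture_density(y, support, weights):
    """f(y; G) = sum_j w_j * f(y; theta_j) for a discrete mixing distribution G."""
    return sum(w * poisson_pmf(y, t) for t, w in zip(support, weights))

def log_likelihood(ys, support, weights):
    """l(G) = sum_i log f(y_i; G)."""
    return sum(math.log(mixture_density(y, support, weights)) for y in ys)

def directional_derivative(theta, ys, support, weights):
    """D(theta; G) = sum_i f(y_i; theta) / f(y_i; G) - n.

    A positive value suggests that adding mass at theta would
    increase the log-likelihood.
    """
    ratio_sum = sum(
        poisson_pmf(y, theta) / mixture_density(y, support, weights) for y in ys
    )
    return ratio_sum - len(ys)
```

By the general equivalence theorem cited in the abstract, a candidate G is the nonparametric maximum likelihood estimate when this function is nowhere positive, so algorithms of this family search for its maxima to propose new support points.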
7	高解析離散逆轉換方法之應用 / Applications on High-Resolution Inversion of the Discrete Transformation 林志哲, Lin, Chih Che Unknown Date The high-resolution inversion method is an important signal-processing technique for solving discrete signal system problems. Applied to parameter estimation for finite binomial mixtures and finite Poisson mixtures, it gives very good results. In this paper, we discuss how the high-resolution inversion method is used in finite binomial mixture and finite Poisson mixture problems. In addition, building on the finite binomial case, we extend our methods to solve finite negative binomial mixture problems in special cases, under assumptions similar to those for finite binomial mixtures. High-resolution binomial mixture Poisson mixture negative binomial mixture Fourier transformation
8	Advanced signal processing techniques for multi-target tracking Daniyan, Abdullahi January 2018 The multi-target tracking problem essentially involves the recursive joint estimation of the states of an unknown and time-varying number of targets present in a tracking scene, given a series of observations. This problem becomes more challenging because the sequence of observations is noisy and can become corrupted due to missed detections and false alarms/clutter. Additionally, the detected observations are indistinguishable from clutter. Furthermore, whether the target(s) of interest are point or extended (in terms of spatial extent) poses even more technical challenges. An approach known as random finite sets provides an elegant and rigorous framework for the handling of the multi-target tracking problem. With a random finite sets formulation, both the multi-target states and multi-target observations are modelled as finite set valued random variables, that is, random variables which are random in both the number of elements and the values of the elements themselves. Furthermore, compared to other approaches, the random finite sets approach possesses a desirable characteristic of being free of explicit data association prior to tracking. In addition, a framework is available for dealing with random finite sets and is known as finite set statistics. In this thesis, advanced signal processing techniques are employed to provide enhancements to and develop new random finite sets based multi-target tracking algorithms for the tracking of both point and extended targets with the aim to improve tracking performance in cluttered environments. To this end, firstly, a new and efficient Kalman-gain aided sequential Monte Carlo probability hypothesis density (KG-SMC-PHD) filter and a cardinalised particle probability hypothesis density (KG-SMC-CPHD) filter are proposed.
These filters employ the Kalman-gain approach during weight update to correct predicted particle states by minimising the mean square error between the estimated measurement and the actual measurement received at a given time in order to arrive at a more accurate posterior. This technique identifies and selects those particles belonging to a particular target from a given PHD for state correction during weight computation. The proposed SMC-CPHD filter provides a better estimate of the number of targets. Besides the improved tracking accuracy, fewer particles are required in the proposed approach. Simulation results confirm the improved tracking performance when evaluated with different measures. Secondly, the KG-SMC-(C)PHD filters are particle filter (PF) based and, as with PFs, they require a process known as resampling to avoid the problem of degeneracy. This thesis proposes a new resampling scheme to address a problem with the systematic resampling method, which has a high tendency to resample very low weight particles, especially when a large number of resampled particles is required; this in turn affects state estimation. Thirdly, the KG-SMC-(C)PHD filters proposed in this thesis perform filtering and not tracking, that is, they provide only point estimates of target states but do not provide connected estimates of target trajectories from one time step to the next. A new post-processing step using game theory as a solution to this filtering-tracking problem is proposed. This approach was named the GTDA method. This method was employed in the KG-SMC-(C)PHD filter as a post-processing technique and was evaluated using both simulated and real data obtained using the NI-USRP software defined radio platform in a passive bi-static radar system. Lastly, a new technique for the joint tracking and labelling of multiple extended targets is proposed.
To achieve multiple extended target tracking using this technique, models for the target measurement rate, kinematic component and target extension are defined and jointly propagated in time under the generalised labelled multi-Bernoulli (GLMB) filter framework. The GLMB filter is a random finite sets-based filter. In particular, a Poisson mixture variational Bayesian (PMVB) model is developed to simultaneously estimate the measurement rate of multiple extended targets and extended target extension was modelled using B-splines. The proposed method was evaluated with various performance metrics in order to demonstrate its effectiveness in tracking multiple extended targets.
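For readers unfamiliar with the systematic resampling method that this abstract takes as its starting point, a minimal sketch of the standard scheme follows. It is the baseline method only, not the new resampling scheme the thesis proposes, and the function name and interface are hypothetical.

```python
import random

def systematic_resample(weights, rng):
    """Return particle indices chosen by standard systematic resampling.

    One uniform draw fixes n evenly spaced positions in [0, 1); each
    position selects the particle whose cumulative-weight interval
    contains it.
    """
    n = len(weights)
    total = sum(weights)
    start = rng.random() / n          # single random offset in [0, 1/n)
    indices = []
    i = 0
    cumulative = weights[0] / total   # upper edge of particle 0's interval
    for k in range(n):
        position = start + k / n
        # Advance to the particle whose interval contains this position.
        while position >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices
```

With equal weights every particle is kept exactly once, and a particle holding all of the weight is copied n times. Because all positions share one random offset, the number of copies of each particle differs from its expected value by less than one.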
