111

馬可夫鏈鎖之理論與應用 / A Study on the Applications of Markov Chain's Theory

高孔廉 Unknown Date (has links)
The theory of Markov chains was first proposed by A. A. Markov in 1907 and initially covered only the finite case; over many years of study by mathematicians, a theory of the infinite case has since been established. In recent years operations researchers have continued to develop the subject and have applied finite Markov chains widely in management science. Domestically, however, the theory has seldom been discussed, and its practical application not at all. This thesis gives an introduction to the theory and explores a practical method of applying it to the estimation of the allowance for bad debts on accounts receivable, in the hope of drawing attention to this emerging field of application, encouraging study of its theory, and enabling businesses to adopt it in practice. The thesis consists of six chapters. Chapter 1, the introduction, describes the motivation for the thesis and the process of collecting the data. Chapter 2 introduces the theory of finite Markov chains; because the case study employs absorbing Markov chains, these are discussed at greater length. Chapter 3 surveys several important application areas of Markov chain theory, such as market forecasting and decision problems. Chapter 4 explains the application to dynamic programming, the so-called Markov decision process. Chapter 5 is a case study: it first outlines the business and accounts-receivable position of a certain company, then applies the theory of Chapter 2 to carry out the calculations and interpret the results. Chapter 6 presents conclusions and recommendations, explaining the main functions of the theoretical model used in the case study and offering suggestions on the company's business and accounting procedures. All data required for the thesis were compiled and computed from the company's raw records, and many difficulties were encountered in reconciling the actual data with the theoretical model; the thesis could be completed on schedule only thanks to the careful guidance of my teachers Professor 陸民仁 and Professor 鄭子昊, the assistance of the company, and the help with the case-study calculations of Mr. 周汝及, lecturer at National Chengchi University, to all of whom I express my heartfelt thanks. Owing to the limits of my knowledge and time, errors and omissions are unavoidable, and I respectfully ask the examining professors for their corrections.
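The core computation in this family of models is standard absorbing-chain algebra: with transient-to-transient block Q and transient-to-absorbing block R, the fundamental matrix N = (I - Q)^(-1) and B = N R give eventual absorption probabilities. A minimal Python sketch with invented ageing states, probabilities, and balances (none of these numbers come from the thesis's company data):

```python
import numpy as np

# Hypothetical ageing states for receivables: current, 1-30, 31-60, 61-90 days.
# Two absorbing states: "paid" and "written off". Q holds transitions among
# transient states, R the transitions into the absorbing states.
Q = np.array([
    [0.30, 0.40, 0.00, 0.00],
    [0.20, 0.10, 0.40, 0.00],
    [0.10, 0.10, 0.10, 0.40],
    [0.05, 0.05, 0.10, 0.20],
])
R = np.array([
    [0.30, 0.00],   # columns: paid, written off
    [0.30, 0.00],
    [0.25, 0.05],
    [0.30, 0.30],
])

# Fundamental matrix N = (I - Q)^(-1); B[i, j] is the probability that a
# dollar now in transient state i is eventually absorbed in state j.
N = np.linalg.inv(np.eye(4) - Q)
B = N @ R

balances = np.array([100_000, 60_000, 30_000, 10_000])  # current balances by age
expected_bad_debt = balances @ B[:, 1]                   # allowance estimate
print(f"Estimated allowance for bad debts: {expected_bad_debt:,.0f}")
```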
112

Statistical Regular Pavings and their Applications

Teng, Gloria Ai Hui January 2013 (has links)
We propose statistical regular pavings (SRPs) as an efficient and adaptive statistical data structure for processing massive, multi-dimensional data. A regular paving (RP) is an ordered binary tree that recursively bisects a box in $\mathbb{R}^{d}$ along its first widest side. An SRP extends an RP with mutable caches of recursively computable statistics of the data. In this study we use SRPs for two major applications: estimating histogram densities and summarising large spatio-temporal datasets. The SRP histograms produced are $L_1$-consistent density estimators, driven by a randomised priority queue that adaptively grows the SRP tree and formalised as a Markov chain over the space of SRPs. One way to select an estimate is to run a Markov chain over the space of SRP trees, also initialised by the randomised priority queue, in which the SRP tree shrinks or grows adaptively through pruning or splitting operations; the stationary distribution of this Markov chain is the posterior distribution over the space of all possible histograms. We then take advantage of the recursive nature of SRPs to compute arithmetic averages efficiently, averaging the states sampled from the stationary distribution to obtain the posterior mean histogram estimate. We also show that SRPs can summarise large datasets, working with a dataset containing high-frequency aircraft position information. Recursively computable statistics can be stored for variable-sized regions of airspace, and the regions themselves can be created automatically to reflect the varying density of aircraft observations, dedicating more computational resources and providing more detailed information in areas with more air traffic. In particular, SRPs can very quickly aggregate or separate data with different characteristics, so that data describing individual aircraft or collected using different technologies (reflecting different levels of precision) can be stored separately and yet very quickly combined using standard arithmetic operations.
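A minimal sketch of the RP idea, assuming nothing beyond the description above: each node stores a box, splits it at the midpoint of its first widest side, and caches a recursively computable statistic (here a simple count) as data are inserted. This illustrates the data structure only, not the thesis's implementation or its priority-queue-driven growth.

```python
import numpy as np

class RPNode:
    """Regular-paving node: a box split at the midpoint of its first widest side."""
    def __init__(self, box):
        self.box = np.asarray(box, dtype=float)  # shape (d, 2): [low, high] per axis
        self.count = 0                           # cached, recursively computable statistic
        self.left = self.right = None

    def contains(self, x):
        return bool(np.all(self.box[:, 0] <= x) and np.all(x <= self.box[:, 1]))

    def split(self):
        widths = self.box[:, 1] - self.box[:, 0]
        i = int(np.argmax(widths))               # first widest coordinate
        mid = 0.5 * (self.box[i, 0] + self.box[i, 1])
        lo, hi = self.box.copy(), self.box.copy()
        lo[i, 1], hi[i, 0] = mid, mid
        self.left, self.right = RPNode(lo), RPNode(hi)

    def insert(self, x):
        self.count += 1                          # update the cache on the way down
        if self.left is not None:                # route to a child; midpoint ties go left
            (self.left if self.left.contains(x) else self.right).insert(x)

# Toy usage: split the unit square once and count points per leaf.
root = RPNode([[0.0, 1.0], [0.0, 1.0]])
root.split()
for x in np.random.default_rng(0).random((1000, 2)):
    root.insert(x)
print(root.count, root.left.count, root.right.count)  # 1000 split roughly in half
```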
113

Exact Markov chain Monte Carlo and Bayesian linear regression

Bentley, Jason Phillip January 2009 (has links)
In this work we investigate the use of perfect sampling methods within the context of Bayesian linear regression. We focus on inference problems related to the marginal posterior model probabilities. Model averaged inference for the response and Bayesian variable selection are considered. Perfect sampling is an alternate form of Markov chain Monte Carlo that generates exact sample points from the posterior of interest. This approach removes the need for burn-in assessment faced by traditional MCMC methods. For model averaged inference, we find the monotone Gibbs coupling from the past (CFTP) algorithm is the preferred choice. This requires the predictor matrix be orthogonal, preventing variable selection, but allowing model averaging for prediction of the response. Exploring choices of priors for the parameters in the Bayesian linear model, we investigate sufficiency for monotonicity assuming Gaussian errors. We discover that a number of other sufficient conditions exist, besides an orthogonal predictor matrix, for the construction of a monotone Gibbs Markov chain. Requiring an orthogonal predictor matrix, we investigate new methods of orthogonalizing the original predictor matrix. We find that a new method using the modified Gram-Schmidt orthogonalization procedure performs comparably with existing transformation methods, such as generalized principal components. Accounting for the effect of using an orthogonal predictor matrix, we discover that inference using model averaging for in-sample prediction of the response is comparable between the original and orthogonal predictor matrix. The Gibbs sampler is then investigated for sampling when using the original predictor matrix and the orthogonal predictor matrix. We find that a hybrid method, using a standard Gibbs sampler on the orthogonal space in conjunction with the monotone CFTP Gibbs sampler, provides the fastest computation and convergence to the posterior distribution. We conclude the hybrid approach should be used when the monotone Gibbs CFTP sampler becomes impractical, due to large backwards coupling times. We demonstrate large backwards coupling times occur when the sample size is close to the number of predictors, or when hyper-parameter choices increase model competition. The monotone Gibbs CFTP sampler should be taken advantage of when the backwards coupling time is small. For the problem of variable selection we turn to the exact version of the independent Metropolis-Hastings (IMH) algorithm. We reiterate the notion that the exact IMH sampler is redundant, being a needlessly complicated rejection sampler. We then determine a rejection sampler is feasible for variable selection when the sample size is close to the number of predictors and using Zellner’s prior with a small value for the hyper-parameter c. Finally, we use the example of simulating from the posterior of c conditional on a model to demonstrate how the use of an exact IMH view-point clarifies how the rejection sampler can be adapted to improve efficiency.
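The backwards-coupling mechanics behind the monotone Gibbs CFTP sampler can be illustrated on a toy monotone chain on {0, ..., K}: maximal and minimal chains are run from ever-earlier start times with shared, reused randomness, and coalescence by time zero sandwiches every trajectory, yielding an exact stationary draw. This is a sketch of Propp-Wilson CFTP in general, not the Bayesian-linear-regression sampler of the thesis:

```python
import random

def update(x, u, K):
    """A monotone transition on {0, ..., K}: step up if u > 0.5, else down."""
    return min(x + 1, K) if u > 0.5 else max(x - 1, 0)

def monotone_cftp(K=10, seed=42):
    """Coupling from the past for the toy monotone chain above."""
    rng = random.Random(seed)
    us = []                              # us[k] drives the move at time -(k+1)
    T = 1
    while True:
        while len(us) < T:
            us.append(rng.random())      # extend the past; never resample old values
        hi, lo = K, 0                    # maximal and minimal states at time -T
        for k in reversed(range(T)):     # sweep from time -T up to time 0
            hi, lo = update(hi, us[k], K), update(lo, us[k], K)
        if hi == lo:                     # coalescence sandwiches every trajectory
            return hi                    # an exact draw from the stationary law
        T *= 2                           # otherwise look further into the past

print(monotone_cftp())
```

Large backwards coupling times correspond to many doublings of T before coalescence, which is exactly the regime in which the thesis recommends switching to the hybrid sampler.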
114

IEEE 802.16與802.11e整合環境的服務品質保證 / QoS Guarantee for IEEE 802.16 Integrating with 802.11e

張志華, Chang, Chih-Hua Unknown Date (has links)
IEEE 802.16 and 802.11e both provide Quality of Service (QoS), but their MAC layers differ. To guarantee QoS across the two, we use a Markov chain model to analyze the delay of 802.11e EDCA under varying numbers of connections. A connection admission control (CAC) mechanism then limits the number of connections so that the delay requirement is met, and a token bucket mechanism throttles the output traffic while satisfying both the delay and bandwidth requirements; our token bucket adjusts its token generation rate automatically as the bandwidth demand changes. Finally, a packet-dropping mechanism is used to improve throughput. We validate the delay, packet drop rate, and throughput of the proposed scheme with the Qualnet simulator; the results show clear improvements in all three measures.
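A generic token bucket of the kind described, with a rate that can be re-tuned as bandwidth demand changes, might look like the sketch below; the class, its parameters, and the numbers are illustrative assumptions, not the controller specified in the thesis.

```python
import time

class TokenBucket:
    """Token bucket with an adjustable token rate (tokens/sec) and bucket depth."""
    def __init__(self, rate, depth):
        self.rate = rate            # token generation rate
        self.depth = depth          # maximum burst size
        self.tokens = depth
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def set_rate(self, rate):
        """Re-tune the token rate, e.g. when the bandwidth requirement changes."""
        self._refill()
        self.rate = rate

    def allow(self, packet_size):
        """Admit a packet if enough tokens have accumulated; otherwise refuse it."""
        self._refill()
        if self.tokens >= packet_size:
            self.tokens -= packet_size
            return True
        return False

bucket = TokenBucket(rate=125_000, depth=16_000)  # e.g. 1 Mbit/s, 16 kB burst
print(bucket.allow(1_500))                        # admit one 1500-byte packet
```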
115

依序選擇四字串使第二字串或第四字串先出現的後選優勢探討 / On the first occurrence of four strings with teams

謝松樺, Hsieh, Sung Hua Unknown Date (has links)
This thesis investigates the first occurrence of four binary strings chosen sequentially, in teams: team 1 consists of strings 1 and 3, and team 2 consists of strings 2 and 4, each chosen after the preceding strings are revealed. The question is whether the later chooser holds an advantage, that is, whether for any given string 1 there exists a string 2 such that for every string 3 at least one string 4 makes team 2 more likely than team 1 to appear first. Computer calculation shows that this later-chooser advantage does exist for string lengths 4, 5, and 6. For string lengths greater than 6, we prove that the advantage also exists whenever string 1 is (0,0,...,0), (0,0,...,0,1), (1,1,...,1), or (1,1,...,1,0).
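The later-chooser advantage studied here is a team version of the classical two-string (Penney-type) phenomenon, which a short Monte Carlo simulation can illustrate. The strings below are a textbook two-string example, not one of the thesis's four-string cases:

```python
import random

def prob_first(a, b, trials=200_000, seed=7):
    """Monte Carlo estimate of P(pattern a occurs before pattern b) in fair bits."""
    rng = random.Random(seed)
    w = max(len(a), len(b))
    wins = 0
    for _ in range(trials):
        window = ""
        while True:
            window = (window + str(rng.randint(0, 1)))[-w:]  # keep a sliding window
            if window.endswith(a):
                wins += 1
                break
            if window.endswith(b):
                break
    return wins / trials

# Classic two-string reversal: (0,1,1) beats (1,1,1) with probability 7/8.
print(prob_first("011", "111"))   # prints roughly 0.875
```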
116

Vienos Markovo grandinės stacionaraus skirstinio uodegos vertinimas / Estimating the tail of the stationary distribution of one Markov chain

Skorniakova, Aušra 04 July 2014 (has links)
In this work we investigate a certain asymptotically homogeneous Markov chain and find the asymptotics of the tail of its stationary distribution. To the best of our knowledge, the chain considered cannot be analysed by currently known methods, so the results have practical value. The problem solved is relevant to heavy-tail analysis.
117

Bayesian Inference in Large-scale Problems

Johndrow, James Edward January 2016 (has links)
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n = all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.

Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is the design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms and for characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.

One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations and lack robust uncertainty quantification; moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis-Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis-Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.

Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.

The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo, the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel, but comparatively little attention has been paid to convergence and estimation error in these approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.

Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution of the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset. / Dissertation
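One simple instance of an approximating transition kernel of the kind analyzed in Chapter 6 is random-walk Metropolis with the log-likelihood estimated from a random subset of the data at each step. The sketch below, on a synthetic Gaussian-mean problem with a flat prior, is only meant to make the idea concrete; it is not one of the specific approximations studied in the thesis, and the resulting chain targets the posterior only approximately.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=10_000)           # synthetic data, true mean 2.0
n, batch = len(data), 500

def loglik_subset(theta, idx):
    """Minibatch estimate of the full-data Gaussian log-likelihood."""
    return (n / len(idx)) * np.sum(-0.5 * (data[idx] - theta) ** 2)

theta, samples = 0.0, []
for _ in range(5_000):
    prop = theta + 0.02 * rng.standard_normal()     # random-walk proposal
    idx = rng.choice(n, size=batch, replace=False)  # fresh random subset each step
    # the same subset scores both states, so much of the estimation noise cancels
    if np.log(rng.random()) < loglik_subset(prop, idx) - loglik_subset(theta, idx):
        theta = prop
    samples.append(theta)

print(np.mean(samples[2_000:]))                     # roughly the posterior mean, near 2.0
```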
118

Prostorový bodový proces s interakcemi / Spatial point process with interactions

Vícenová, Barbora January 2015 (has links)
This thesis deals with the estimation of the parameters of a model of interacting segments in the plane. The motivation is an application to the system of stress fibers in human mesenchymal stem cells, detected by fluorescence microscopy. The segment model is defined as a spatial Gibbs point process with marks. We use two methods for parameter estimation: the moment method and the Takacs-Fiksel method, and implement algorithms for both in the software Mathematica. We also simulate the model structure by Markov chain Monte Carlo using a birth-death process. Numerical results are presented for real and simulated data, and the fit between model and data is assessed by descriptive statistics.
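The birth-death Markov chain Monte Carlo step is easiest to see on an unmarked pairwise-interaction (Strauss) process rather than the thesis's marked segment model. A sketch with invented parameter values, alternating birth and death proposals accepted with the standard Papangelou-intensity ratios:

```python
import numpy as np

rng = np.random.default_rng(1)
beta, gamma, r, area = 100.0, 0.5, 0.05, 1.0      # Strauss parameters, unit square

def close_pairs_with(pts, x):
    """Number of existing points within interaction range r of x."""
    return int(np.sum(np.linalg.norm(np.asarray(pts) - x, axis=1) < r)) if pts else 0

pts = []
for _ in range(20_000):
    if rng.random() < 0.5:                        # propose a birth at a uniform location
        x = rng.random(2)
        # density ratio beta * gamma**t times the Hastings correction area/(n+1)
        a = beta * gamma ** close_pairs_with(pts, x) * area / (len(pts) + 1)
        if rng.random() < a:
            pts.append(x)
    elif pts:                                     # propose a death (skip if no points)
        i = rng.integers(len(pts))
        rest = pts[:i] + pts[i + 1:]
        a = len(pts) / (area * beta * gamma ** close_pairs_with(rest, pts[i]))
        if rng.random() < a:
            pts = rest

print(len(pts), "points in the simulated Strauss pattern")
```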
119

Clustering of Driver Data based on Driving Patterns

Kabra, Amit January 2019 (has links)
Data analysis methods are needed to make sense of the ever-growing quantity of high-dimensional data. Cluster analysis separates or partitions data into disjoint groups such that data in the same group are similar while data in different groups are dissimilar. The focus of this thesis is to identify natural groups, or clusters, of drivers from data describing driving style. To find such groups, combinations of dimensionality-reduction and clustering algorithms are evaluated. The dimensionality-reduction algorithms used are Principal Component Analysis (PCA) and t-distributed stochastic neighbour embedding (t-SNE); the clustering algorithms, K-means and hierarchical clustering, were selected after a literature review. The thesis evaluates PCA with K-means, PCA with hierarchical clustering, t-SNE with K-means, and t-SNE with hierarchical clustering on a Volvo Cars dataset of drivers and their driving styles. The dataset is first normalized and a Markov chain of driving styles is computed; because this Markov chain dataset is very high-dimensional, the dimensionality-reduction algorithms are applied to reduce the dimensions, and the reduced dataset is used as input to the selected clustering algorithms. The combinations are evaluated with performance metrics: the Silhouette Coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index. Based on the experiments and analysis, the combination of t-SNE and K-means performs best on all metrics and is chosen to cluster the drivers by driving style.
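The evaluated pipeline, reducing flattened per-driver transition matrices with PCA or t-SNE, clustering with K-means, and scoring with the three indices, can be sketched with scikit-learn. The Dirichlet-generated rows below are a stand-in for the proprietary Volvo Cars data; the state count, driver count, and cluster count are invented for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)

rng = np.random.default_rng(0)
S, n_drivers = 10, 200                      # hypothetical: 10 driving-style states
# Each driver: an S x S Markov transition matrix (rows are distributions), flattened.
X = rng.dirichlet(np.ones(S), size=(n_drivers, S)).reshape(n_drivers, S * S)

for name, reducer in [("PCA", PCA(n_components=2)),
                      ("t-SNE", TSNE(n_components=2, random_state=0))]:
    Z = reducer.fit_transform(X)            # reduce, then cluster in the low-dim space
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(Z)
    print(f"{name}: silhouette={silhouette_score(Z, labels):.3f}  "
          f"CH={calinski_harabasz_score(Z, labels):.1f}  "
          f"DB={davies_bouldin_score(Z, labels):.3f}")
```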
120

Practice-driven solutions for inventory management problems in data-scarce environments

Wang, Le 03 June 2019 (has links)
Many firms must make inventory decisions with limited data and high customer service level requirements. This thesis focuses on heuristic solutions for inventory management problems in data-scarce environments, employing rigorous mathematical frameworks and taking advantage of information that is available in practice but often ignored in the literature. We define a class of inventory models and solutions with demonstrable value in helping firms meet these challenges.
