61

The Prior Distribution in Bayesian Statistics

Chen, Kai-Tang 01 May 1979 (has links)
A major problem associated with Bayesian estimation is selecting the prior distribution. The more recent literature on the selection of the prior is reviewed. Very little of a general nature on the selection of the prior is found in the literature except for non-informative priors, and this class of priors is seen to have limited usefulness. A method of selecting an informative prior is generalized in this thesis to include estimation of several parameters using a multivariate prior distribution. The concepts required for quantifying prior information are based on intuitive principles. In this way, they can be understood and controlled by the decision maker (i.e., those responsible for the consequences) rather than by analysts. The information required is: (1) prior point estimates of the parameters being estimated and (2) an expression of the desired influence of the prior relative to the present data in determining the parameter estimates (e.g., that the prior should have twice as much influence as the data). These concepts (point estimates and influence) may be used equally with subjective or quantitative prior information.
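As a rough illustration of the influence idea, the sketch below assumes a conjugate normal model with known data variance, where the desired influence ratio is converted into an equivalent prior sample size; the function and variable names are illustrative, not taken from the thesis.

```python
import numpy as np

def posterior_mean(prior_point_estimate, data, influence_ratio, data_var):
    """Combine a prior point estimate with data, where influence_ratio is the
    desired weight of the prior relative to the data (e.g. 2.0 means the
    prior should count twice as much as the observed sample)."""
    n = len(data)
    equiv_n = influence_ratio * n        # prior treated as this many pseudo-observations
    prior_var = data_var / equiv_n       # tighter prior -> more influence
    w_prior = 1.0 / prior_var
    w_data = n / data_var
    return (w_prior * prior_point_estimate + w_data * np.mean(data)) / (w_prior + w_data)

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=20)
print(posterior_mean(prior_point_estimate=4.0, data=sample,
                     influence_ratio=2.0, data_var=4.0))
```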
62

Unbalanced Analysis of Variance Comparing Standard and Proposed Approximation Techniques for Estimating the Variance Components

Pugsley, James P. 01 May 1984 (has links)
This paper considers the estimation of the components of variation for a two-factor unbalanced nested design and compares standard techniques with proposed approximation procedures. Current procedures are complicated and assume the unbalanced sample sizes to be fixed. This paper tests some simpler techniques that treat the sample sizes as random variables. Monte Carlo techniques were used to generate data for testing these new procedures.
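A minimal Monte Carlo sketch of the setting, assuming for simplicity a balanced two-factor nested random model and the standard expected-mean-squares estimators (the thesis treats the harder unbalanced case with random sample sizes):

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K = 10, 4, 5                      # levels of A, B nested in A, replicates
sigma_a2, sigma_b2, sigma_e2 = 4.0, 2.0, 1.0

a = rng.normal(0, np.sqrt(sigma_a2), size=(I, 1, 1))
b = rng.normal(0, np.sqrt(sigma_b2), size=(I, J, 1))
e = rng.normal(0, np.sqrt(sigma_e2), size=(I, J, K))
y = 10.0 + a + b + e                    # y_ijk = mu + a_i + b_ij + e_ijk

# ANOVA mean squares for the balanced nested design
grand = y.mean()
mean_ij = y.mean(axis=2)
mean_i = y.mean(axis=(1, 2))
ms_a = J * K * np.sum((mean_i - grand) ** 2) / (I - 1)
ms_b = K * np.sum((mean_ij - mean_i[:, None]) ** 2) / (I * (J - 1))
ms_e = np.sum((y - mean_ij[:, :, None]) ** 2) / (I * J * (K - 1))

# Method-of-moments estimates from the expected mean squares
sigma_e2_hat = ms_e
sigma_b2_hat = (ms_b - ms_e) / K
sigma_a2_hat = (ms_a - ms_b) / (J * K)
print(sigma_a2_hat, sigma_b2_hat, sigma_e2_hat)
```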
63

Design Optimization Using Model Estimation Programming

Brimhall, Richard Kay 01 May 1967 (has links)
Model estimation programming provides a method for obtaining extreme solutions subject to constraints. Functions which are continuous with continuous first and second derivatives in the neighborhood of the solution are approximated using quadratic polynomials (termed estimating functions) derived from computed or experimental data points. Using the estimating functions, an approximation problem is solved by a numerical adaptation of the method of Lagrange. The method is not limited by the concavity of the objective function. Beginning with an initial array of data observations, an initial approximate solution is obtained. Using this approximate solution as a new datum point, the coefficients for the estimating function are recalculated with a constrained least squares fit which forces intersection of the functions and their estimating functions at the last three observations. The constraining of the least squares estimate provides a sequence of approximate solutions which converge to the desired extremal. A digital computer program employing the technique is used extensively by Thiokol Chemical Corporation's Wasatch Division, especially for vehicle design optimization where flight performance and hardware constraints must be satisfied simultaneously.
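A hedged sketch of the surrogate idea, assuming a toy objective and a single linear equality constraint: it fits a quadratic estimating function by least squares and finds a constrained extremum of the surrogate with an off-the-shelf solver, rather than reproducing the thesis's Lagrange-based numerical scheme.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

def f(x):   # "expensive" objective, known only at sampled data points
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2 + 0.3 * x[0] * x[1]

X = rng.uniform(-2, 2, size=(30, 2))
y = np.array([f(x) for x in X])

# Quadratic basis: 1, x1, x2, x1^2, x2^2, x1*x2
def basis(x):
    return np.array([1.0, x[0], x[1], x[0] ** 2, x[1] ** 2, x[0] * x[1]])

A = np.vstack([basis(x) for x in X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)      # least-squares estimating function
surrogate = lambda x: basis(x) @ coef

# Constrained extremum of the surrogate, here subject to x1 + x2 = 1
res = minimize(surrogate, x0=np.zeros(2), method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda x: x[0] + x[1] - 1.0}])
print(res.x, f(res.x))
```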
64

Multicollinearity and the Estimation of Regression Coefficients

Teed, John Charles 01 May 1978 (has links)
The precision of the estimates of the regression coefficients in a regression analysis is affected by multicollinearity. The effect of certain factors on multicollinearity and the estimates was studied. The response variables were the standard error of the regression coefficients and a standardized statistic that measures the deviation of the regression coefficient from the population parameter. The estimates are not influenced by any one factor in particular, but rather by some combination of factors. The larger the sample size, the better the precision of the estimates, no matter how "bad" the other factors may be. The standard error of the regression coefficients proved to be the best indication of estimation problems.
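A small illustration, not the thesis's simulation design, of how nearly collinear predictors inflate the standard errors of ordinary least squares coefficients, summarized with a variance inflation factor:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)     # nearly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
cov_beta = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov_beta))                # standard errors of the coefficients

# VIF for x1: regress x1 on the other predictor and use 1 / (1 - R^2)
Z = np.column_stack([np.ones(n), x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r2 = 1 - np.sum((x1 - Z @ g) ** 2) / np.sum((x1 - x1.mean()) ** 2)
print("coefficient SEs:", se, "VIF(x1):", 1.0 / (1.0 - r2))
```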
65

Parameter Estimation for Generalized Pareto Distribution

Lin, Der-Chen 01 May 1988 (has links)
The generalized Pareto distribution was introduced by Pickands (1975). Three methods of estimating the parameters of the generalized Pareto distribution were compared by Hosking and Wallis (1987): maximum likelihood, the method of moments, and probability-weighted moments. An alternative method of estimation for the generalized Pareto distribution, based on least squares regression of expected order statistics (REOS), is developed and evaluated in this thesis. A Monte Carlo comparison is made between this method and the estimating methods considered by Hosking and Wallis (1987). The REOS method is shown to be generally superior to maximum likelihood, the method of moments, and probability-weighted moments.
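For context, the sketch below shows two of the classical estimators compared in the study, maximum likelihood and the method of moments, on simulated generalized Pareto data; the REOS estimator itself is not reproduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
c_true, scale_true = 0.2, 1.0
x = stats.genpareto.rvs(c_true, loc=0, scale=scale_true, size=500, random_state=rng)

# Maximum likelihood (location fixed at 0)
c_mle, _, scale_mle = stats.genpareto.fit(x, floc=0)

# Method of moments: mean = s/(1-c), var = s^2 / ((1-c)^2 (1-2c)) for c < 1/2
m, v = x.mean(), x.var(ddof=1)
c_mom = 0.5 * (1.0 - m ** 2 / v)
scale_mom = m * (1.0 - c_mom)

print("MLE:", c_mle, scale_mle)
print("MOM:", c_mom, scale_mom)
```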
66

Adaptive Stochastic Gradient Markov Chain Monte Carlo Methods for Dynamic Learning and Network Embedding

Tianning Dong (14559992) 06 February 2023 (has links)
Latent variable models are widely used in modern data science for both static and dynamic data. This thesis focuses on large-scale latent variable models formulated for time series data and static network data. The former refers to the state space model for dynamic systems, which models the evolution of latent state variables and the relationship between the latent state variables and observations. The latter refers to a network decoder model, which maps a large network into a low-dimensional space of latent embedding vectors. Both problems can be solved by adaptive stochastic gradient Markov chain Monte Carlo (MCMC), which allows us to simulate the latent variables and estimate the model parameters simultaneously, and thus facilitates downstream statistical inference from the data.

For the state space model, the challenge is inference for high-dimensional, large-scale, and long series data. The existing algorithms, such as the particle filter or sequential importance sampler, do not scale well to the dimension of the system and the sample size of the dataset, and often suffer from the sample degeneracy issue for long series data. To address this issue, the thesis proposes the stochastic approximation Langevinized ensemble Kalman filter (SA-LEnKF) for jointly estimating the states and unknown parameters of the dynamic system, where the parameters are estimated on the fly based on the state variables simulated by the LEnKF under the framework of stochastic approximation MCMC. Under mild conditions, we prove its consistency in parameter estimation and ergodicity in state variable simulation. The proposed algorithm can be used in uncertainty quantification for long series, large-scale, and high-dimensional dynamic systems. Numerical results on simulated datasets and large real-world datasets indicate its superiority over existing algorithms and its great potential in the statistical analysis of complex dynamic systems encountered in modern data science.

For the network embedding problem, an appropriate embedding dimension is hard to determine under the theoretical framework of existing methods, where the embedding dimension is often treated as a tunable hyperparameter or a choice of common practice. The thesis proposes a novel network embedding method with a built-in mechanism for embedding dimension selection. The basic idea is to treat the embedding vectors as the latent inputs of a deep neural network (DNN) model. An adaptive stochastic gradient MCMC algorithm then simulates the embedding vectors and estimates the parameters of the DNN model simultaneously. By the theory of sparse deep learning, the embedding dimension can be determined by imposing an appropriate sparsity penalty on the DNN model. Experiments on real-world networks show that our method can perform dimension selection in network embedding while preserving network structures.
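As a pointer to the stochastic gradient MCMC family the thesis builds on, here is a generic stochastic gradient Langevin dynamics (SGLD) update on a toy Gaussian-mean posterior; it is only an illustration, not the SA-LEnKF or the sparse-DNN embedding sampler.

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy target: posterior of a Gaussian mean with a N(0, 10^2) prior
data = rng.normal(loc=3.0, scale=1.0, size=1000)

def grad_log_post(theta, minibatch, n_total):
    grad_prior = -theta / 100.0                  # d/dtheta log N(theta | 0, 10^2)
    grad_lik = np.sum(minibatch - theta)         # d/dtheta sum_i log N(x_i | theta, 1)
    return grad_prior + (n_total / len(minibatch)) * grad_lik

theta, eps, samples = 0.0, 1e-4, []
for t in range(5000):
    mb = rng.choice(data, size=50, replace=False)
    theta += 0.5 * eps * grad_log_post(theta, mb, len(data)) \
             + np.sqrt(eps) * rng.normal()       # Langevin noise
    samples.append(theta)

print("posterior mean estimate:", np.mean(samples[1000:]))
```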
67

Sample Size Determination for Subsampling in the Analysis of Big Data, Multiplicative models for confidence intervals and Free-Knot changepoint models

Sheng Zhang (18468615) 11 June 2024 (has links)
<p dir="ltr">We studied the relationship between subsample size and the accuracy of resulted estimation under big data setup.</p><p dir="ltr">We also proposed a novel approach to the construction of confidence intervals based on improved concentration inequalities.</p><p dir="ltr">Lastly, we studied irregular change-point models using free-knot splines.</p>
68

Qwixx Strategies Using Simulation and MCMC Methods

Blank, Joshua W 01 June 2024 (has links) (PDF)
This study explores optimal strategies for maximizing scores and winning in the popular dice game Qwixx, analyzing both single-player and multiplayer gameplay scenarios. Through extensive simulations, various strategies were tested and compared, including a score-based approach that uses a formula tuned by MCMC random walks, and race-to-lock approaches that use the absorbing Markov chain structure of individual score sheet rows to find ways to lock rows as quickly as possible. Results indicate that employing a score-based strategy, considering gap, count, position, skip, and likelihood scores, significantly improves performance in single-player games, while move restrictions based on specific dice roll sums in the race-to-lock strategy were found to enhance winning and scoring in multiplayer games. While the results do not achieve the optimal scores attained by prior informal work, the study provides valuable insights into decision-making processes and gameplay optimization for Qwixx enthusiasts, offering practical guidance for players seeking to enhance their performance and strategic prowess in the game. It also serves as a lesson in how to approach similar optimization problems in the future.
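For readers unfamiliar with the absorbing Markov chain machinery mentioned above, the sketch below computes expected steps to absorption from the fundamental matrix N = (I - Q)^(-1) for a made-up three-transient-state example, not an actual Qwixx score-sheet row.

```python
import numpy as np

# Transient-to-transient block Q and transient-to-absorbing block R
Q = np.array([[0.6, 0.3, 0.0],
              [0.0, 0.5, 0.4],
              [0.0, 0.0, 0.7]])
R = np.array([[0.1],
              [0.1],
              [0.3]])

N = np.linalg.inv(np.eye(3) - Q)     # fundamental matrix
expected_steps = N @ np.ones(3)      # expected moves until the row "locks"
absorption_prob = N @ R              # all ones here, since there is one absorbing state
print(expected_steps, absorption_prob.ravel())
```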
69

Finding a Representative Distribution for the Tail Index Alpha, α, for Stock Return Data from the New York Stock Exchange

Burns, Jett 01 May 2022 (has links)
Statistical inference is a tool for creating models that can accurately describe real-world events. Special importance is given to financial methods that model risk and large price movements. A parameter that describes tail heaviness, and risk overall, is α. This research finds a representative distribution that models α. The absolute values of standardized stock returns from the Center for Research in Security Prices (CRSP) are used in this research. The inference is performed using R. Approximations for α are found using the ptsuite package. The GAMLSS package employs maximum likelihood estimation to estimate distribution parameters using the CRSP data. The distributions are selected using AIC and worm plots. The Skew t family is found to be representative for the parameter α based on subsets of the CRSP data. The Skew t type 2 distribution is robust across multiple subsets of values calculated from the CRSP stock return data.
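The thesis works in R with the ptsuite and GAMLSS packages; as a language-agnostic illustration of tail index estimation, here is the classical Hill estimator applied to heavy-tailed toy data.

```python
import numpy as np

rng = np.random.default_rng(7)
# Heavy-tailed toy data: absolute values of Student-t "returns" (tail index ~ df = 3)
returns = np.abs(rng.standard_t(df=3, size=10_000))

def hill_alpha(x, k):
    """Hill estimate of the tail index from the k largest observations."""
    order = np.sort(x)[::-1]                   # descending order statistics
    logs = np.log(order[:k]) - np.log(order[k])
    return k / np.sum(logs)

print(hill_alpha(returns, k=500))              # close to 3 for this toy sample
```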
70

Software Profiling of Rogue Events in High-Volume Gauging

Bering, Thomas P.K. 10 1900 (has links)
Customers are placing ever increasing demands on automotive part manufacturers for high quality parts at low cost. Increasingly, the demand is for zero defects, or defect rates of less than one part per billion. This creates a significant challenge for manufacturers: how to achieve these low defect levels economically while producing large volumes of parts. Importantly, the presence of infrequent process and measurement (gauge) events can adversely affect product quality. This thesis uses a statistical mixture model that allows one to assume a main production process that occurs most of the time, and secondary rogue events that occur infrequently. Often the rogue events correspond to necessary operator activity, like equipment repairs and tooling replacement. The mixture model predicts that some gauge observations will be influenced by combinations of these rogue events. Certain production applications, like those involving feedback or high-reliability gauging, are heavily influenced by rogue events and combinations of rogue events. A special runtime software profiler was created to collect information about rogue events, and statistical techniques (rogue event analysis) were used to estimate the waste generated by these rogue events. The value of these techniques was successfully demonstrated in three different industrial automotive part production applications. Two of these systems involve an automated feedback application with Computer Numerically Controlled (CNC) machining centers and Coordinate Measuring Machine (CMM) gauges. The third application involves a high-reliability inspection system that used optical, camera-based, machine-vision technology. The original system accepted reject parts at a rate of 98.7 parts per million (ppm), despite multiple levels of redundancy. The final system showed no outgoing defects on a 1 million part factory data sample and a 100 million part simulated data sample. It is expected that the final system reliability will meet the 0.001 ppm specification, which represents a huge improvement. / Doctor of Philosophy (PhD)
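A hedged sketch of the two-component mixture idea (a dominant main process plus an infrequent rogue component), fit with a basic EM loop; the thesis's mixture model and profiling software are far richer than this.

```python
import numpy as np

def norm_pdf(v, m, s):
    return np.exp(-0.5 * ((v - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(8)
main = rng.normal(0.0, 1.0, size=4900)      # in-control gauge readings
rogue = rng.normal(6.0, 2.0, size=100)      # infrequent rogue-event readings
x = np.concatenate([main, rogue])

# EM for a two-component Gaussian mixture; component 1 plays the rogue role
pi = 0.05
mu = np.array([np.median(x), np.quantile(x, 0.99)])
sd = np.array([x.std(), x.std()])
for _ in range(200):
    # E-step: responsibility of the rogue component for each observation
    p0 = (1.0 - pi) * norm_pdf(x, mu[0], sd[0])
    p1 = pi * norm_pdf(x, mu[1], sd[1])
    r = p1 / (p0 + p1)
    # M-step: update mixing weight, means, and standard deviations
    pi = r.mean()
    mu = np.array([np.average(x, weights=1.0 - r), np.average(x, weights=r)])
    sd = np.array([np.sqrt(np.average((x - mu[0]) ** 2, weights=1.0 - r)),
                   np.sqrt(np.average((x - mu[1]) ** 2, weights=r))])

print("estimated rogue fraction:", pi, "rogue mean:", mu[1])
```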
