Global ETD Search

61	ON SEQUENTIAL UNBIASED AND BAYES-TYPE ESTIMATES OF PARAMETERS IN A CONTINGENCY TABLE Unknown Date (has links) Estimation of the probability parameters in a contingency table with linear and/or log-linear constraints on the parameters is the principal concern of this thesis. Sequential unbiased estimates of the cell probabilities as well as some Bayes posterior mean type estimates are considered. / Chapter I is a review of some earlier work on the sequential unbiased estimation of the probability parameter in a Bernoulli process. The review begins with the classical work of Girshick, Mosteller and Savage (1946) and some follow-up studies like Wolfowitz (1946), Savage (1947), Blackwell (1947), Lehmann and Stein (1950), DeGroot (1959) and Kagan, Linnik and Rao (1973). In several cases the original proofs have been simplified and the arguments streamlined. / Chapter II deals with the problem of sequential unbiased estimation of the parameters in a contingency table with linear and/or log-linear constraints. Multinomial Girshick, Mosteller and Savage (GMS) type stopping rules are discussed and the corresponding unbiased estimates based on the minimal sufficient statistic described. Consistency, in the sense of Wolfowitz (1947), of such estimates is demonstrated. Unbiased estimates of parametric functions like log-contrasts are derived. Sufficient conditions for the completeness of the GMS-type stopping rules are given. / In Chapter III, the problem of sequential unbiased estimation of the probability parameters in the Bradley-Terry (1952) model of paired comparisons is studied. The Bradley-Terry model can be summarized as follows. Suppose that there are t treatments T(,1), ..., T(,t) that can be pairwise compared.
The Bradley-Terry model postulates that associated with treatment T(,i) is a "strength" parameter (PI)(,i) > 0, i = 1, ..., t, such that if treatments T(,i) and T(,j) are compared, the probability that T(,i) is preferred to T(,j) is (theta)(,ij) = (PI)(,i)/((PI)(,i) + (PI)(,j)). The model imposes log-linear constraints on the (theta)(,ij)'s so that techniques similar to those in Chapter II may be used to obtain unbiased estimates, based on a sufficient statistic. / In Chapter IV, two Bayes-type procedures for estimating the multinomial cell probability vector p, in the presence of linear constraints on the parameters, are proposed and illustrated with examples. A general prior is used with the restriction that the moment generating function of the prior exists in a closed form. The estimators are shown to be strongly consistent. Estimation under log-linear constraints is also considered. Finally, Bayes-type estimators for the covariance matrix of the cell frequencies are presented for some special cases of linearly and log-linearly constrained problems. / Chapter V is concerned with a Bayesian approach to the estimation of parameters in the Bradley-Terry model of paired comparisons. It is assumed that the sum of the treatment parameters (PI)(,i) is 1, and a Dirichlet prior for (PI) = ((PI)(,1), ..., (PI)(,t)) is used. Using the induced prior of (theta)(,ij) and Z(,ij) = (PI)(,i) + (PI)(,j), an estimate (PI)(,ij) of (PI)(,i), based on the data arising from the comparisons of treatments T(,i) and T(,j), is obtained. An estimate of (PI)(,i) based on all the data is a weighted combination of the (PI)(,ij)'s that minimizes a risk function. Similarly, estimates for log-contrasts of the (PI)(,i)'s are obtained. This technique of estimation is extended to the Luce-model of multiple comparisons. / Source: Dissertation Abstracts International, Volume: 42-06, Section: B, page: 2436. / Thesis (Ph.D.)--The Florida State University, 1981. Statistics
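The preference probability in the Bradley-Terry model described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names and the strength values are hypothetical, chosen only to show the formula (theta)(,ij) = (PI)(,i)/((PI)(,i) + (PI)(,j)) and the log-linear structure it implies, namely that the log-odds of a preference equal the difference of the log strengths and therefore add along a chain of comparisons.

```python
import math


def preference_probability(strength_i, strength_j):
    """Probability that treatment i is preferred to treatment j."""
    if strength_i <= 0 or strength_j <= 0:
        raise ValueError("strength parameters must be positive")
    return strength_i / (strength_i + strength_j)


def log_odds(strength_i, strength_j):
    """Log-odds of preferring i to j; equals log(strength_i) - log(strength_j)."""
    theta_ij = preference_probability(strength_i, strength_j)
    return math.log(theta_ij / (1.0 - theta_ij))


# Hypothetical strengths for t = 3 treatments, normalized to sum to 1
# as assumed in Chapter V of the abstract.
strengths = [0.5, 0.3, 0.2]

theta_01 = preference_probability(strengths[0], strengths[1])
theta_10 = preference_probability(strengths[1], strengths[0])

# Log-linear constraint: log-odds add along a chain of comparisons.
chain = log_odds(strengths[0], strengths[1]) + log_odds(strengths[1], strengths[2])
direct = log_odds(strengths[0], strengths[2])
```

Here `theta_01 + theta_10` equals 1 and `chain` agrees with `direct` up to floating-point error, which is the kind of constraint on the (theta)(,ij)'s that the abstract refers to.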
62	EXTREMAL PROBLEMS IN LARGE DEVIATIONS AND BAYESIAN NONPARAMETRIC FAILURE-RATE ESTIMATION Unknown Date (has links) Source: Dissertation Abstracts International, Volume: 40-09, Section: B, page: 4364. / Thesis (Ph.D.)--The Florida State University, 1979. Statistics
63	CONTRIBUTIONS TO ASYMPTOTIC THEORY FOR STATISTICS AS FUNCTIONALS OF THE EMPIRICAL DENSITY FUNCTION Unknown Date (has links) Source: Dissertation Abstracts International, Volume: 40-09, Section: B, page: 4365. / Thesis (Ph.D.)--The Florida State University, 1979. Statistics
64	PARTIAL SEQUENTIAL TESTS FOR THE MEAN OF A NORMAL DISTRIBUTION Unknown Date (has links) Recently, Billard (1977) introduced a truncated partial sequential procedure for testing a null hypothesis about a normal mean with known variance against a two-sided alternative hypothesis. That procedure had the disadvantage that a large number of observations is necessary if the null hypothesis is to be accepted. A new procedure is introduced which reduces the expected sample size for all mean values with considerable reductions for values near the null mean value. Theoretical operating characteristic and average sample number functions are derived, and the empirical distribution of the sample size in some special cases is obtained. / For the case of unknown variance and a one-sided alternative hypothesis, there are a number of tests, the best known of which are those of Wald (1947) and Barnard (1952). These tests address hypotheses expressed in units of (mu)/(sigma). In this work, a partial sequential test procedure is introduced for hypotheses concerned only with (mu). An advantage of this new procedure is its relative simplicity and ease of execution when compared to the above tests. This is essentially due to the fact that in the present procedure the transformed observations follow a central t-distribution as distinct from the noncentral t-distribution. The difficulties caused by the noncentral distribution explain the relative lack of progress in obtaining results about the properties, such as the operating characteristic and average sample number functions, of the tests of Barnard and Wald. The key element in the present procedure is that a number of observations is taken initially before any decision is made; subsequent observations are then taken in batches, the sizes of which depend on the estimate for the variance obtained from the initial set of observations. Some properties of the procedure are studied.
In particular, an approximation to the theoretical operating characteristic function is derived and the sensitivity of the average sample number function to changes in some of the test parameters is investigated. / The ideas developed for the partial sequential t-test are extended to develop tests of hypotheses concerning the parameters of a simple linear regression equation, general linear hypotheses and hypotheses about the mean of special cases of the multivariate normal. / Source: Dissertation Abstracts International, Volume: 42-06, Section: B, page: 2435. / Thesis (Ph.D.)--The Florida State University, 1981. Statistics
65	CONTRIBUTIONS TO BAYESIAN NONPARAMETRIC STATISTICS Unknown Date (has links) Source: Dissertation Abstracts International, Volume: 40-09, Section: B, page: 4367. / Thesis (Ph.D.)--The Florida State University, 1979. Statistics
66	ON DETERMINING THE NUMBER OF PREDICTORS IN A REGRESSION EQUATION USED FOR PREDICTION Unknown Date (has links) It is generally recognized that all the available variables should not necessarily be used as predictors in a linear regression equation. The problems which may arise from using too many predictors become especially acute in a regression equation used for prediction with independent data. In this case, the skill of prediction may actually deteriorate with increasing numbers of predictors. However, there is no definitive explanation as to why this should be so. There is also no universally accepted procedure for determining the number of predictors to use. The various regression methods which do exist are logically contrived but are also largely based on subjective considerations. / The goal of this research is to develop and test a criterion that will indicate a priori the "optimum" number of predictors to use in a prediction equation. The mean square error statistic is used to evaluate the performance of a regression equation in both the dependent and independent samples. Selecting the "best" prediction equation consists of determining the equation with the minimum estimated independent sample mean square error. Several approximations and estimators of the independent sample mean square error which have appeared in the literature are discussed and two new estimators are derived. / These approximations and estimators are tested in Monte Carlo simulations to determine their skill in indicating the number of predictors which will yield the best prediction equation. The sample size, number of available predictors, correlations among the variables, distribution of the variables, and selection method are manipulated to explore how these various factors influence the performances of the mean square error estimators. 
It is found that the better estimators are capable of indicating a number of predictors to include in the regression equation for which the corresponding independent sample mean square error is near the minimum value. / As a practical test, the various estimators of the independent sample mean square error are applied to the data used in deriving the Model Output Statistics (MOS) maximum and minimum temperature forecast equations used by the National Weather Service. These prediction equations are linear regression equations derived using a forward selection method. The sequence of prediction equations corresponding to the forward trace of all the available predictors is derived for each of 192 cases and then applied to independent data. The forecasts made by the operational p = 10 predictor MOS equations are compared with those made by the equations determined by the estimators of the independent sample mean square error. The operational equations have the best overall verification statistics. The estimators persistently underestimate the values of the independent sample mean square error, but one of the new estimators is able to determine MOS forecast equations that perform as well as the operational equations. Furthermore, it is able to accomplish this without the use of an independent sample to help determine the optimum number of predictors. / Source: Dissertation Abstracts International, Volume: 41-05, Section: B, page: 1822. / Thesis (Ph.D.)--The Florida State University, 1980. Statistics
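The selection rule stated in the abstract above (choose the equation with the minimum estimated independent sample mean square error) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thesis's own estimators are not given in the abstract, so the sketch takes the estimated values as input, and the function names and numbers are hypothetical.

```python
def mean_square_error(observed, predicted):
    """Average squared difference between observations and predictions."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def best_predictor_count(estimated_mse_by_p):
    """Return the number of predictors p whose estimated MSE is smallest.

    estimated_mse_by_p maps p to an estimate of the independent sample
    mean square error for the equation with p predictors.
    """
    if not estimated_mse_by_p:
        raise ValueError("no candidate equations supplied")
    return min(estimated_mse_by_p, key=estimated_mse_by_p.get)


# Hypothetical estimates along a forward-selection sequence: the error
# first falls and then rises again as more predictors are added.
estimated = {1: 4.0, 2: 3.1, 3: 3.4, 4: 3.9}
chosen_p = best_predictor_count(estimated)
```

With these made-up values `chosen_p` is 2. In practice the interesting part is how the estimates are obtained without an independent sample, which is the subject of the thesis and is not reproduced here.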
67	TWO-WAY CLUSTER ANALYSIS WITH NOMINAL DATA Unknown Date (has links) Consider an M by N data matrix X whose elements may assume values 0, 1, 2, . . ., H. Denote the rows of X by (alpha)(,1), (alpha)(,2), . . ., (alpha)(,M). A tree on the rows of X is a sequence of distinct partitions P(,i), i = 1, . . ., k, such that: (a) P(,1) = {((alpha)(,1)), . . ., ((alpha)(,M))}, (b) P(,i) is a refinement of P(,i+1) for i = 1, . . ., k-1, and (c) P(,k) = {((alpha)(,1), . . ., (alpha)(,M))}. The two-way clustering problem consists of simultaneously constructing trees on the rows, columns, and elements of X. A generalization of a two-way joining algorithm (TWJA) introduced by J. A. Hartigan (1975) is used to construct the three trees. / The TWJA requires the definition of measures of dissimilarity between row clusters and column clusters respectively. Two approaches are used in the construction of these dissimilarity coefficients--one based on intuition and one based on a formal prediction model. For matrices with binary elements (0 or 1), measures of dissimilarity between row or column clusters are based on the number of mismatching pairs. Consider two distinct row clusters R(,p) and R(,q) containing m(,p) and m(,q) rows respectively. One measure of dissimilarity, d(,0)(R(,p), R(,q)), between R(,p) and R(,q), is / (DIAGRAM, TABLE OR GRAPHIC OMITTED...PLEASE SEE DAI) / where b(,p(beta)) and b(,q(beta)) are the number of ones in column (beta) of clusters R(,p) and R(,q) respectively. Two additional intuitive dissimilarity coefficients are also defined and studied. / For matrices containing nominal level data, dissimilarity coefficients are based on a formal prediction model. Analogous to the procedure of Cleveland and Relles (1974), for a given data matrix, the model consists of a scheme for random selection of two rows (or columns) from the matrix and an identification rule for distinguishing between the two rows (or columns).
A loss structure is defined for both rows and columns and the expected loss due to incorrect row or column identification is computed. The dissimilarity between two (say) row clusters is then defined to be the increase in expected loss due to joining those two row clusters into a single cluster. / Stopping criteria are suggested for both the intuitive and prediction model approaches. For the intuitive approach, it is suggested that joining be stopped when the dissimilarity between the (say) row clusters to be joined next exceeds that expected by chance under the assumption that the (say) column totals of the matrix are fixed. For the prediction model approach the stopping criterion is based on a cluster prediction model in which the objective is to distinguish between row or column clusters. A cluster identification rule is defined based on the information in the partitioned data matrix and the expected loss due to incorrect cluster identification is computed. The expected cluster loss is also computed when cluster identification is based on strict randomization. The relative decrease in expected cluster loss due to identification based on the partitioned matrix versus that based on randomization is suggested as a stopping criterion. / Both contrived and real data examples are used to illustrate and compare the two clustering procedures. Computational aspects of the procedure are discussed and it is concluded that the intuitive approach is less costly in terms of computation time. Further, five admissibility properties are defined and, for certain intuitive dissimilarity coefficients, the trees produced by the TWJA are shown to possess three of the five properties. / Source: Dissertation Abstracts International, Volume: 41-05, Section: B, page: 1822. / Thesis (Ph.D.)--The Florida State University, 1980. Statistics
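The count of mismatching pairs on which the binary dissimilarity coefficients above are based can be computed from the column totals b(,p(beta)) and b(,q(beta)) alone. The Python sketch below shows only that count. The formula for d(,0) itself is omitted in the source, so any normalization it applies is not reproduced, and the function names and example clusters are hypothetical.

```python
def column_ones(cluster):
    """Number of ones in each column of a cluster given as a list of 0/1 rows."""
    return [sum(column) for column in zip(*cluster)]


def mismatching_pairs(cluster_p, cluster_q):
    """Count (row of p, row of q, column) triples whose two entries differ.

    In column beta a pair mismatches when one row holds a 1 and the other
    a 0, which gives b_p * (m_q - b_q) + (m_p - b_p) * b_q such pairs.
    """
    m_p, m_q = len(cluster_p), len(cluster_q)
    b_p, b_q = column_ones(cluster_p), column_ones(cluster_q)
    return sum(
        bp * (m_q - bq) + (m_p - bp) * bq
        for bp, bq in zip(b_p, b_q)
    )


# Hypothetical row clusters from a binary matrix with three columns.
R_p = [[1, 0, 1],
       [1, 1, 0]]
R_q = [[0, 0, 1]]

count = mismatching_pairs(R_p, R_q)  # 4 mismatches across the two row pairs
```

Comparing the rows directly confirms the count: the first row of `R_p` differs from the row of `R_q` in one column and the second row differs in three.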
68	AN INVESTIGATION OF OUTLIER DEFINITION AND THE IMPACT OF THE MASKING PHENOMENON ON SEVERAL STATISTICAL OUTLIER TESTS Unknown Date (has links) Source: Dissertation Abstracts International, Volume: 41-01, Section: B, page: 0260. / Thesis (Ph.D.)--The Florida State University, 1980. Statistics
69	BETWEEN-CLASS AND WITHIN-CLASS EFFECTS IN APTITUDE X TREATMENT INTERACTION Unknown Date (has links) Source: Dissertation Abstracts International, Volume: 41-01, Section: B, page: 0259. / Thesis (Ph.D.)--The Florida State University, 1979. Statistics
70	ON NONPARAMETRIC ESTIMATION OF DENSITY AND REGRESSION FUNCTIONS Unknown Date (has links) In the field of statistical estimation, nonparametric procedures have received increased attention for the past decade. In particular, various nonparametric estimates of probability density functions and regression curves have been extensively studied, with special attention to large sample pr / Source: Dissertation Abstracts International, Volume: 41-03, Section: B, page: 1017. / Thesis (Ph.D.)--The Florida State University, 1980. Statistics

Search results