We discuss two-sample problems and the implementation of a new two-sample data
analysis procedure. The proposed procedure is based on the concepts of mid-distribution,
design of score functions, components, comparison distribution, comparison density
and exponential model. Assume that we have a random sample X1, ..., Xm from
a continuous distribution F(y) = P(Xi ≤ y), i = 1, ..., m, and a random sample
Y1, ..., Yn from a continuous distribution G(y) = P(Yi ≤ y), i = 1, ..., n. Also assume
independence of the two samples. The two-sample problem tests homogeneity of
two samples and formally can be stated as H0 : F = G. To solve the two-sample problem,
a number of tests have been proposed by statisticians in various contexts. Two
typical tests are the two-sample t-test and the Wilcoxon rank-sum test. However,
since they test for differences in location, they extract less information
from the data than a test of the homogeneity of the distribution functions. Even
though the Kolmogorov-Smirnov or Anderson-Darling test statistics can be used
to test H0 : F = G, those statistics give no indication of the actual relation
of F to G when H0 : F = G is rejected. Our goal is to learn why it was rejected.
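
To make the contrast concrete, here is a minimal sketch using only the Python standard library. The two samples are hypothetical and deterministic, chosen to share a centre but differ in spread; the function names `welch_t` and `ks_statistic` are illustrative and are not taken from the thesis. The location statistic sees no difference, while the Kolmogorov-Smirnov distance does, yet that single number does not say how F and G differ.

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch two-sample t statistic; it compares sample means only."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_m(t) - G_n(t)|.

    Both empirical CDFs are step functions, so the maximum is attained
    at one of the observed values.
    """
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )


# Hypothetical samples (assumption): same centre, different spread, no ties.
x = [2 * (2 * k - 19) for k in range(20)]  # even values from -38 to 38
y = [7 * (2 * k - 19) for k in range(20)]  # odd values from -133 to 133

t_stat = welch_t(x, y)
ks_stat = ks_statistic(x, y)
print("Welch t statistic:", t_stat)
print("KS statistic:     ", ks_stat)
```

Both samples have mean zero, so the t statistic vanishes even though the spreads differ; the KS statistic is nonzero but is only a single distance.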
Our approach explains the rejection using graphical tools, which are its main feature.
It is functional in the sense that the parameters to be estimated
are probability density functions. Compared with other statistical tools for
two-sample problems, such as the t-test or the Wilcoxon rank-sum test, density
estimation gives a fuller understanding of the data, which is essential in data analysis.
Our approach to density estimation also works with small sample sizes, and our
methodology makes almost no assumptions about the two continuous distributions F and
G. In that sense, our approach is nonparametric. It also brings graphical tools
to the two-sample problem, where few are typically available.
Furthermore, our procedure helps researchers conclude why two
populations differ when H0 is rejected and describe
the relation between F and G graphically.
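
As a rough illustration of the comparison-distribution idea, the sketch below computes an empirical comparison distribution D(u) = F_m(H^{-1}(u)), where H is the empirical distribution of the pooled sample. This pooled-sample convention is an assumption; the abstract does not give the exact definition used in the thesis, and the mid-distribution corrections, score functions, and exponential model it mentions are omitted. The function name `comparison_distribution` and the data are hypothetical.

```python
from bisect import bisect_right


def comparison_distribution(x, y):
    """Empirical D(u) = F_m(H^{-1}(u)) on the grid u = j/N, j = 1..N.

    H is the empirical CDF of the pooled sample (assumed convention), so
    H^{-1}(j/N) is the j-th pooled order statistic when there are no ties.
    Returns a list of (u, D(u)) pairs.
    """
    xs = sorted(x)
    pooled = sorted(list(x) + list(y))
    n_pooled = len(pooled)
    return [
        (j / n_pooled, bisect_right(xs, pooled[j - 1]) / len(xs))
        for j in range(1, n_pooled + 1)
    ]


# Hypothetical samples (assumption): same centre, different spread, no ties.
x = [2 * (2 * k - 19) for k in range(20)]
y = [7 * (2 * k - 19) for k in range(20)]

curve = comparison_distribution(x, y)

# If F = G, D(u) stays close to the diagonal D(u) = u.
max_gap = max(abs(d - u) for u, d in curve)

for u, d in curve[::5]:
    print(f"u = {u:.3f}   D(u) = {d:.2f}")
print("largest departure from the diagonal:", round(max_gap, 3))
```

Plotting D(u) against u shows where the curve leaves the diagonal. In this example D(u) stays flat near u = 0 and u = 1 and rises steeply in the middle, which indicates that the X sample is concentrated in the centre of the pooled data, that is, a difference in spread, not in location.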
Identifier | oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/2653 |
Date | 01 November 2005 |
Creators | Choi, Sujung |
Contributors | Parzen, Emanuel |
Publisher | Texas A&M University |
Source Sets | Texas A and M University |
Language | en_US |
Detected Language | English |
Type | Book, Thesis, Electronic Dissertation, text |
Format | 891742 bytes, electronic, application/pdf, born digital |