1 |
New Nonparametric Tests for Panel Count Data / Zhao, Xingqiu 04 1900
Statistical analysis of panel count data is an important topic in a number of applied fields including biology, engineering, econometrics, medicine, and public health. Panel count data arise when subjects undergoing a recurrent event process are observed only at discrete time points, so that only the numbers of events occurring between successive observation times are available. The choice of method for analyzing panel count data usually depends on the relationship between the observation times and the response variable and on the questions of interest. Most previous research assumes that the observation times are fixed. If the observation times are random, the data structure becomes more challenging, since the observation times vary across subjects in addition to the observations being incomplete. A model-based approach has been used to deal with such data. However, this method relies on extra assumptions about the observation scheme and is thus restrictive in practice. In this dissertation, we discuss the problem of multi-sample nonparametric comparison of counting processes with panel count data, which arise naturally when recurrent events are considered. For this problem, we develop some new nonparametric tests.
First, we construct a class of nonparametric test statistics based on the integrated weighted differences between the estimated mean functions of the counting processes, where the isotonic regression estimate is used for the mean functions. The asymptotic distributions of the proposed statistics are derived and their finite-sample properties are examined through Monte Carlo simulations. A panel count data set from a cancer study is analyzed and presented as an illustrative example.
As shown through Monte Carlo simulations, the nonparametric maximum likelihood estimator (NPMLE) of the mean function is more efficient than the nonparametric maximum pseudo-likelihood estimator (NPMPLE). However, no nonparametric tests based on the NPMLE have been discussed in the literature for panel count data, since the NPMLE is more complicated both theoretically and computationally. It is, therefore, particularly important to develop nonparametric tests based on the NPMLE for panel count data.
In the second part of the dissertation, we focus on the situation in which treatment indicators can be regarded as independent and identically distributed random variables, and we propose a nonparametric test for this case using the maximum likelihood estimator. The asymptotic property of the test statistic is derived. Simulation studies suggest that the proposed method works well in practical situations and is more powerful than the existing tests based on the NPMPLEs of the mean functions.
In the third part of the dissertation, we consider more general situations. We construct a class of nonparametric tests based on the accumulated weighted differences between the rates of increase of the estimated mean functions of the counting processes over observation times, where the nonparametric maximum likelihood approach is used to estimate the mean functions instead of the nonparametric maximum pseudo-likelihood approach. The asymptotic distributions of the proposed statistics are derived and their finite-sample properties are evaluated by means of Monte Carlo simulations. The simulation results show that the proposed methods work quite well and that the tests based on the NPMLE are more powerful than those based on the NPMPLE. Two real data sets are analyzed and presented as illustrative examples.
The last part of the dissertation discusses a special type of panel count data, namely current status or case 1 interval-censored data. Such data often occur in tumorigenicity experiments. For nonparametric two-sample comparison based on censored or interval-censored data, most existing methods focus on testing the hypothesis that the two population distributions are identical, under the assumption that the observation or censoring times have the same distribution. We consider the nonparametric Behrens-Fisher hypothesis (NBFH) in this setting. For this purpose, we study the asymptotic property of the nonparametric maximum likelihood estimator of the probability that an observation from the first distribution exceeds an observation from the second distribution. A nonparametric test for the NBFH is proposed and the asymptotic normality of the proposed test statistic is established. The method is evaluated using simulation studies and illustrated with a set of real data from a tumorigenicity experiment. / Thesis / Doctor of Philosophy (PhD)
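The isotonic regression estimate of the mean function mentioned in the first part can be illustrated with a small sketch. It assumes the pooled data are (observation time, cumulative count) pairs and computes, at the distinct observation times, the weighted isotonic regression of the mean counts, with weights equal to the number of observations at each time. The function name `isotonic_mean_estimate` and the data layout are hypothetical, and the code is an illustration of the standard pool-adjacent-violators algorithm rather than the dissertation's own implementation.

```python
from collections import defaultdict


def isotonic_mean_estimate(observations):
    """Nondecreasing estimate of a mean function from pooled panel counts.

    observations: iterable of (time, cumulative_count) pairs, pooled over
    all subjects (hypothetical layout, chosen for illustration).
    Returns (times, fitted): the sorted distinct observation times and the
    fitted nondecreasing mean values at those times.
    """
    # time -> [sum of cumulative counts, number of observations at that time]
    totals = defaultdict(lambda: [0.0, 0])
    for t, n in observations:
        totals[t][0] += n
        totals[t][1] += 1

    times = sorted(totals)

    # Pool-adjacent-violators: each block is [mean, weight, points pooled].
    blocks = []
    for t in times:
        count_sum, weight = totals[t]
        blocks.append([count_sum / weight, weight, 1])
        # Merge neighbouring blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, k2 = blocks.pop()
            m1, w1, k1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, k1 + k2])

    # Expand the pooled blocks back to one fitted value per distinct time.
    fitted = []
    for mean, _, k in blocks:
        fitted.extend([mean] * k)
    return times, fitted
```

With the toy input `[(1, 2), (1, 4), (2, 1), (3, 5)]`, the raw means at times 1 and 2 (3 and 1) violate monotonicity and are pooled into their weighted average before the value at time 3 is appended.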
|
2 |
品種重複的無母數估計 / Nonparametric Estimation of Species Overlap / 林逢章, Lin, Feng-Chang Unknown Date
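To describe how similar two study sites A and B are, one natural starting point is whether they host the same species, so species overlap serves as an index of the similarity between two sites. In biological and ecological research, the most commonly used species overlap index is the Jaccard index, which is based on species counts: it is computed from the total numbers of species at sites A and B and the number of species common to both sites, and it is one of the measures that Gower (1985) classifies as describing the similarity between two units. In our study, the Jaccard index computed from the observed numbers of species and of shared species is treated as an estimate. Describing similarity with species as the only unit of counting, while ignoring the abundance of each species, loses information. We therefore extend the Jaccard index to a new index, N, which takes the number of individuals of each species as the unit of counting, and we estimate the N index by nonparametric maximum likelihood (Nonparametric Maximum Likelihood Estimator, NPMLE). In addition, Smith, Solow and Preston (1996) proposed a delta-beta-binomial model to correct the underestimation of the Jaccard index, and we take the species overlap inferred from this model as a third estimator. The focus of our study is a simulation comparison of how these three estimators behave when estimating the true parameter.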
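In the simulation experiments, following the Monte Carlo approach, we design 6 balanced populations in which all species have equal occurrence probabilities and 12 unbalanced populations in which the species occurrence probabilities follow a geometric distribution, and we judge the quality of each estimator by its mean and standard deviation over 500 samples. According to the results, when the population is known to be balanced, two of the estimators perform well, whereas the third performs well for both balanced and unbalanced populations; the first two, however, are severely biased for some unbalanced populations.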
Beyond the simulation experiments, we derive the expectation and variance of the NPMLE and prove that it is an asymptotically unbiased estimator of the N index. Taking bird records from wetlands in north-western Taiwan as a real example, we compute the three estimates and use the bootstrap to estimate the standard deviations of the three estimators, finding that the NPMLE has the smallest variability. / In describing the similarity between communities A and B, species overlap is one kind of measure. In ecology and biology, the Jaccard index (Gower, 1985) for species overlap is widely used, and it serves as one of the estimators in our research. The Jaccard index is simply the ratio of the number of overlapping species, that is, species appearing in both communities, to the number of distinct species across the two communities. However, this index ignores species proportion information, assigning equal weight to all species. We propose a new index, N, which includes proportion information and is estimated by a Nonparametric Maximum Likelihood Estimator (NPMLE). Smith et al. (1996) proposed a delta-beta-binomial model to correct the underestimation of the Jaccard index; we use the estimator from this model as a third estimator.
In our Monte Carlo simulations, we design 6 balanced populations in which every species has an equal proportion and 12 unbalanced populations in which species proportions follow a geometric distribution. We found that two of the estimators are accurate for balanced populations but overestimate or underestimate the true value for some unbalanced populations, whereas the remaining estimator is robust for both balanced and unbalanced populations.
In addition to the simulation results, we give theoretical results that establish some asymptotic properties of the NPMLE. As an example, we analyze species abundance data for wild bird communities at two locations in north-western Taiwan. Via bootstrapping, the NPMLE has a smaller standard error than the other two estimators.
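As a rough illustration of the observed Jaccard index and of a bootstrap standard error for it, the sketch below treats each site's sample as a list of species labels, one per observed individual. The function names, the default number of bootstrap replicates, and the resampling scheme (drawing individuals with replacement within each site) are assumptions made for this example; the abstract does not specify the thesis's bootstrap procedure, and the N index and the delta-beta-binomial estimator are not reproduced here.

```python
import random


def jaccard_index(sample_a, sample_b):
    """Observed Jaccard index: shared species / distinct species overall."""
    species_a, species_b = set(sample_a), set(sample_b)
    union = species_a | species_b
    if not union:
        raise ValueError("both samples are empty")
    return len(species_a & species_b) / len(union)


def bootstrap_se(sample_a, sample_b, n_boot=1000, seed=0):
    """Bootstrap standard error of the observed Jaccard index.

    Assumed scheme: resample individuals with replacement within each
    site, recompute the index, and take the standard deviation of the
    replicates. n_boot and seed are arbitrary illustrative defaults.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled_a = rng.choices(sample_a, k=len(sample_a))
        resampled_b = rng.choices(sample_b, k=len(sample_b))
        estimates.append(jaccard_index(resampled_a, resampled_b))
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5
```

For two samples with species `{a, b, c}` and `{b, c, d}`, two species are shared out of four distinct ones, so `jaccard_index` returns 0.5. Because resampling can only lose species that were observed, the bootstrap replicates also make visible how rare species drive the variability of the index.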
|