  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Bayesian Analysis of Binary Sales Data for Several Industries

Chen, Zhilin 30 April 2015
The analysis of big data is now very popular. Big data can be very valuable to companies and to society if we can take full advantage of it. Data scientists characterize big data with four Vs: volume, velocity, variety and veracity. In short, the data are large in volume, grow at high velocity, come in many varieties and must be of high quality. Here we analyze data from many sources (varieties). In small area estimation, the term "big data" refers to numerous areas. We want to analyze binary data for a large number of small areas. Standard Markov chain Monte Carlo (MCMC) methods then do not work because the computation time is prohibitive. To solve this problem, we use numerical approximations. We set up four methods: MCMC, a method based on the Beta-Binomial model, the Integrated Nested Normal Approximation (INNA) model and the Empirical Logistic Transform (ELT) method. We compare the processing time and accuracy of these four methods in order to find the fastest reasonably accurate one. Finally, we combine the empirical logistic transform method, which is the fastest accurate method, with time series to explore the sales data over time.
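As an illustration of why the empirical logistic transform is cheap to compute, the following is a minimal sketch of the textbook form of the transform, with 0.5 added to each count, together with its usual variance approximation. The abstract does not state which exact form the thesis uses, so the formula choice, the function name `empirical_logit` and the example counts are all assumptions made for illustration.

```python
import math


def empirical_logit(successes, trials):
    """Return the empirical logistic transform and its approximate variance.

    Textbook form with a 0.5 continuity correction; assumed here, not taken
    from the thesis itself.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    failures = trials - successes
    # Adding 0.5 to both counts keeps the logit finite when a count is zero.
    z = math.log((successes + 0.5) / (failures + 0.5))
    # Commonly used variance approximation for the empirical logit.
    var = (trials + 1) * (trials + 2) / (
        trials * (successes + 1) * (failures + 1)
    )
    return z, var


# Hypothetical small areas: (number of successes, number of units sampled).
areas = [(0, 12), (5, 10), (30, 40)]
transformed = [empirical_logit(y, n) for y, n in areas]
```

Each area needs only one logarithm and a few multiplications, with no iteration, which is consistent with the abstract's remark that this method is the fastest of the four.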
2

品種重複的無母數估計 / Nonparametric Estimation of Species Overlap

林逢章, Lin, Feng-Chang Unknown Date
Species overlap is one measure of the similarity between two communities, A and B. In ecology and biology, the most widely used overlap measure is the Jaccard index, one of the similarity coefficients classified by Gower (1985): the number of species shared by the two communities divided by the total number of distinct species in A and B combined. In our research, the Jaccard index computed from the observed species counts is treated as an estimator. Because this index counts species only and ignores their abundances, assigning equal weight to all species, it loses information. We therefore propose a new index, N, which is based on the number of individuals of each species, and estimate it with a nonparametric maximum likelihood estimator (NPMLE). Smith, Solow and Preston (1996) proposed a delta-beta-binomial model to correct the underestimation of the Jaccard index, which gives a third estimator. Our study compares the behavior of these three estimators by simulation. In our Monte Carlo simulations, we design 6 balanced populations, in which every species has an equal proportion, and 12 unbalanced populations, in which species proportions follow a geometric distribution, and judge each estimator by its mean and standard deviation over 500 samples. We found that two of the estimators are accurate when the population is known to be balanced but can be severely biased, over- or underestimating the true value, for some unbalanced populations, whereas the remaining estimator performs well for both balanced and unbalanced populations. In addition to the simulation results, we derive the expectation and variance of the NPMLE and prove that it is an asymptotically unbiased estimator of the N index. As an example, we compute the three estimates from records of wild birds at two wetland locations in north-western Taiwan. Bootstrap estimates of the standard errors show that the NPMLE has the smallest variability of the three.
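For readers unfamiliar with the Jaccard index, the sketch below computes it as shared species divided by all distinct species, and attaches a simple bootstrap standard error obtained by resampling individual records within each site. The species lists, the function names and the resampling scheme are hypothetical illustrations; the thesis's own bootstrap design and its N index are not described in enough detail in the abstract to reproduce.

```python
import random
import statistics


def jaccard_index(species_a, species_b):
    """Shared species divided by all distinct species across both sites."""
    a, b = set(species_a), set(species_b)
    shared = len(a & b)
    total = len(a) + len(b) - shared
    if total == 0:
        raise ValueError("both sites are empty")
    return shared / total


def bootstrap_se(species_a, species_b, n_boot=500, seed=0):
    """Bootstrap standard error, resampling individuals within each site."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample_a = rng.choices(species_a, k=len(species_a))
        resample_b = rng.choices(species_b, k=len(species_b))
        estimates.append(jaccard_index(resample_a, resample_b))
    return statistics.stdev(estimates)


# Hypothetical records: one entry per individual bird observed at each site.
site_a = ["egret", "heron", "plover", "egret", "sandpiper"]
site_b = ["heron", "plover", "plover", "stilt"]

j = jaccard_index(site_a, site_b)  # 2 shared species out of 5 distinct
se = bootstrap_se(site_a, site_b)
```

Note that `jaccard_index` discards how many individuals of each species were seen, which is exactly the information loss that motivates an abundance-based index.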
3

Definição do nível de significância em função do tamanho amostral / Setting the level of significance depending on the sample size

Oliveira, Melaine Cristina de 28 July 2014
When testing hypotheses, the convention is to fix a value (typically 0.05) for the maximum acceptable Type I error (the probability of rejecting H0 given that it is true), also known as the significance level of the test and denoted here by alpha. In most cases the Type II error, or beta (the probability of accepting H0 given that it is false), is not even calculated. Nor is it common to ask whether the chosen alpha is reasonable for the problem under analysis, or for the sample size at hand. This text aims to prompt reflection on these issues and suggests that the significance level should be a function of the sample size. Instead of fixing a single level of significance, we suggest fixing the ratio of severity between Type I and Type II errors, based on the losses incurred in each case, and then, given a sample size, setting the ideal level of significance as the one that minimizes the linear combination of the decision errors. We show examples of simple, composite and sharp hypotheses for the comparison of proportions, comparing the most conventionally used approach with the proposed Bayesian approach.
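The idea of letting the significance level depend on the sample size can be sketched for the simplest case: two simple hypotheses about a single binomial proportion. The code below searches over cutoffs for the one that minimizes a weighted sum a*alpha + b*beta. The weights `a` and `b`, the values `p0 = 0.5` and `p1 = 0.7`, and the function names are assumptions chosen for illustration; the thesis's Bayesian treatment of composite and sharp hypotheses is not reproduced here.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def optimal_alpha(n, p0, p1, a=1.0, b=1.0):
    """Find the cutoff c (reject H0 when X >= c) minimizing a*alpha + b*beta.

    Assumes simple hypotheses H0: p = p0 against H1: p = p1 with p1 > p0.
    Returns (c, alpha, beta) at the minimizing cutoff.
    """
    best = None
    for c in range(n + 2):
        alpha = sum(binom_pmf(k, n, p0) for k in range(c, n + 1))
        beta = sum(binom_pmf(k, n, p1) for k in range(c))
        loss = a * alpha + b * beta
        if best is None or loss < best[0]:
            best = (loss, c, alpha, beta)
    return best[1:]


# The minimizing alpha for several hypothetical sample sizes.
results = {n: optimal_alpha(n, p0=0.5, p1=0.7) for n in (10, 50, 200)}
```

With equal weights, the minimizing alpha in this sketch is well above 0.05 at n = 10 and far below it at n = 200, so a fixed 0.05 is too strict for the small sample and too lenient for the large one under this loss.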
5

國民中學基本學力測驗量尺分數之研究 / A Study of Scale Scores in the Basic Competence Test for Junior High School Students

蔡雅瑩, Tsai, Ya-Ying Unknown Date
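The Basic Competence Test for Junior High School Students has been administered since 2001 and is now entering its fifth year. The theoretical basis of its scale scores is mainly the Strong True Score Model proposed by Lord: an appropriate model is used to transform examinees' number-correct data, from which the current scale scores are computed. However, very little research has been done in Taiwan on the scale scores of this test. The first part of this study therefore describes the theory of scale score transformation and details the process and methods of the data transformation. The test committee currently uses two models, an upper-bound model and a lower-bound model. Both are estimated from the examinees' number-correct data, the model with the lower mean squared error is adopted, and the data transformation then proceeds. However, both models are suitable only for describing a unimodal distribution of number-correct scores, whereas the English subject data have shown a bimodal distribution in every year of the test. This study therefore proposes a mixture bimodal model, which describes the distribution of number-correct scores more clearly than the existing upper-bound and lower-bound models and has a lower mean squared error. This study also offers a new view of the scale score transformation process: the data show that, when the scale mean is assigned, using a weighted mean works better than the current practice of fixing the assigned scale mean at 30 points for every subject, and it yields a lower scale standard deviation. Finally, the study summarizes these methods and conclusions and proposes a new procedure for computing scale scores, as a reference for the test committee and for researchers.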
6

Analysis of individual feminine cycle hormone profiles for assessment of luteal defect / Analyse des profils hormonaux du cycle féminin pour l’étude de la déficience lutéale

Abdullah, Saman 10 September 2018
Even in normally cycling women, hormone levels vary widely between cycles and between women. Beyond day-by-day levels, hormone profiles display a great variety of heights, durations, locations, and shapes. These observations have renewed interest in the assessment of individual rather than general hormone profiles. Indeed, the cycle hormone profiles reported in the literature are averages of many individual profiles, and individual profiles may be far from matching these averages. This raises the need for sharper descriptions. In this thesis, we explore the diversity of hormonal profiles observed during the luteal phase of the menstrual cycle and present an original concept to characterize most hormone waves using only four parameters. This was obtained via a beta-binomial distribution. Moreover, we propose a new regression model that takes the hormonal profile as the dependent variable and a variety of binary or continuous variables as predictors. We applied the method to describe hormone profiles during the luteal phase. The results suggest that, instead of a binary classification (normal/abnormal), it would be more appropriate to consider a continuum from normal luteal phase to luteal deficiency. In the analyzed dataset, a small follicle had a negative impact on the quality of the luteal phase, and a high periovulatory PDG level (i.e., premature luteinization) seemed detrimental to the luteal phase. A luteal PDG level that is normal and then low is probably a sign of luteal phase abnormality. Furthermore, distinct progesterone metabolite profiles during the luteal phase were found to be correlated with several characteristics of the women and of the cycle.
