  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

A Bayesian nonparametric approach for the two-sample problem / Uma abordagem bayesiana não paramétrica para o problema de duas amostras

Console, Rafael de Carvalho Ceregatti de 19 November 2018
In this work, we discuss the so-called two-sample problem (Pearson and Neyman, 1930) from a nonparametric Bayesian perspective. Given two independent i.i.d. samples X1, ..., Xn and Y1, ..., Ym generated from P1 and P2, respectively, the two-sample problem consists in deciding whether P1 and P2 are equal. Assuming a nonparametric prior, we propose an evidence index for the null hypothesis H0: P1 = P2 based on the posterior distribution of the distance d(P1, P2) between P1 and P2. This evidence index is easy to compute, has an intuitive interpretation, and can also be justified in the Bayesian decision-theoretic context. Further, in a Monte Carlo simulation study, our method performed well compared with the well-known Kolmogorov-Smirnov test, the Wilcoxon test, and a recent testing procedure based on the Polya tree process proposed by Holmes et al. (2015). Finally, we applied our method to a data set of scale measurements from three different groups of patients who completed a questionnaire for the diagnosis of Alzheimer's disease.
12

Tests for homogeneity of survival distributions against non-location alternatives and analysis of the gastric cancer data

Bagdonavičius, Vilijandas B., Levuliene, Ruta, Nikulin, Mikhail S., Zdorova-Cheminade, Olga January 2004
Two-sample and k-sample tests of the equality of survival distributions are given for right-censored data, against alternatives that include cross-effects of survival functions and proportional and monotone hazard ratios. The asymptotic power against approaching alternatives is investigated. The tests are applied to the well-known chemotherapy and radiotherapy data of the Gastrointestinal Tumor Study Group. The P-values for both proposed tests are much smaller than those of other known tests. Unlike the test of Stablein and Koutrouvelis, the new tests can be applied not only to singly censored but also to randomly censored data.
13

Detecting Disguised Missing Data

Belen, Rahime 01 February 2009
In some applications, explicit codes such as NA (not available) are provided for missing data. Many applications, however, do not provide such explicit codes, and valid or invalid data codes are recorded as legitimate data values. Such missing values are known as disguised missing data. Disguised missing data may degrade the quality of data analysis; for example, the association rules discovered in the KDD-Cup-98 data sets have clearly shown the need to apply data quality management prior to analysis. In this thesis, to tackle the problem of disguised missing data, we analyzed the embedded unbiased sample heuristic (EUSH), demonstrated the method's drawbacks, and proposed a new methodology based on the chi-square two-sample test. The proposed method does not require any domain background knowledge and compares favorably with EUSH.
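For readers unfamiliar with the chi-square two-sample test named in the abstract above, the following is a minimal sketch of the generic Pearson homogeneity statistic for two categorical samples. It is not the thesis's detection method; the function name `chi_square_two_sample` and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def chi_square_two_sample(sample_a, sample_b):
    """Pearson chi-square homogeneity statistic for two non-empty categorical samples.

    Returns (statistic, degrees_of_freedom). Illustrative helper, not taken
    from the thesis.
    """
    counts_a, counts_b = Counter(sample_a), Counter(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    total = n_a + n_b
    categories = set(counts_a) | set(counts_b)

    statistic = 0.0
    for category in categories:
        pooled = counts_a[category] + counts_b[category]
        for observed, n in ((counts_a[category], n_a), (counts_b[category], n_b)):
            # Expected count if both samples shared one distribution.
            expected = pooled * n / total
            statistic += (observed - expected) ** 2 / expected

    # Two samples, k categories: (2 - 1) * (k - 1) degrees of freedom.
    return statistic, len(categories) - 1


if __name__ == "__main__":
    # Hypothetical toy data: a suspicious value "0" is over-represented in one sample.
    print(chi_square_two_sample(["0", "0", "0", "1"], ["0", "1", "1", "1"]))
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom suggests that the two samples do not share a distribution.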
14

Efficient Stepwise Procedures for Minimum Effective Dose Under Heteroscedasticity

Wang, Yinna 25 July 2012
No description available.
15

A Tree-based Framework for Difference Summarization

Li, Rong 19 April 2012
No description available.
16

The Two-Sample t-test and the Influence of Outliers : - A simulation study on how the type I error rate is impacted by outliers of different magnitude.

Widerberg, Carl January 2019
This study investigates how outliers of different magnitude impact the robustness of the two-sample t-test. A simulation study approach is used to analyze the behavior of type I error rates when outliers are added to generated data. Outliers may distort parameter estimates such as the mean and variance and cause misleading test results. Previous research has shown that Welch’s t-test performs better than the traditional Student’s t-test when group variances are unequal. Therefore, these two alternative statistics are compared in terms of type I error rates when outliers are added to the samples. The results show that control of type I error rates can be maintained in the presence of a single outlier. Depending on the magnitude of the outlier and the sample size, there are scenarios where the t-test is robust. However, the sensitivity of the t-test is illustrated by deteriorating type I error rates when more than one outlier is included. The comparison between Welch’s t-test and Student’s t-test shows that the former is marginally more robust against outlier influence.
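The kind of simulation described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the study's actual design: the sample sizes, the single-outlier scheme, the function names, and especially the use of a normal critical value in place of the exact t quantiles (which slightly inflates rejection rates at small sample sizes) are all choices made here to keep the example dependency-free.

```python
import math
import random
import statistics


def t_statistics(x, y):
    """Return (Student t, Welch t) for two samples."""
    n_x, n_y = len(x), len(y)
    mean_diff = statistics.fmean(x) - statistics.fmean(y)
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    # Student: pooled variance, assumes equal group variances.
    pooled = ((n_x - 1) * var_x + (n_y - 1) * var_y) / (n_x + n_y - 2)
    student = mean_diff / math.sqrt(pooled * (1 / n_x + 1 / n_y))
    # Welch: separate variance estimates per group.
    welch = mean_diff / math.sqrt(var_x / n_x + var_y / n_y)
    return student, welch


def type_one_error_rates(n_x=20, n_y=40, outlier=None, reps=2000, seed=0):
    """Estimate rejection rates under a true null (both groups N(0, 1)).

    ASSUMPTION: a two-sided 5% normal critical value approximates the
    t quantiles; an exact study would use the t distribution instead.
    """
    rng = random.Random(seed)
    critical = statistics.NormalDist().inv_cdf(0.975)
    rejections = [0, 0]
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n_x)]
        y = [rng.gauss(0, 1) for _ in range(n_y)]
        if outlier is not None:
            x[0] = outlier  # replace one observation with a fixed outlier
        for i, t in enumerate(t_statistics(x, y)):
            rejections[i] += abs(t) > critical
    return rejections[0] / reps, rejections[1] / reps


if __name__ == "__main__":
    print("no outlier:", type_one_error_rates())
    print("outlier = 6:", type_one_error_rates(outlier=6.0))
```

Unequal group sizes are used because with equal sizes the Student and Welch statistics coincide and differ only in their degrees of freedom.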
17

COMPARATIVE ANALYSIS OF RURAL AND URBAN START-UP ENTREPRENEURS

Joo, Hyunjeong 01 January 2011
This study investigates the reasons for apparent differences in entrepreneurship rates in rural and urban areas using a Survey of Rural Kentucky Residents (SRKR) and the Panel Study of Entrepreneurial Dynamics (PSED) data. We estimate the determinants of dissimilar characteristics for rural and urban areas in two aspects: one is individual and contextual resources; the other is cultural tendencies of resources. The results of the analysis suggest that the difference in available individual, economic, and social support resources does not explain the observed difference in entrepreneurship rates. The results also indicate that gender, ethnicity, income, and number of children in the family have different effects on entrepreneurial intentions in rural and urban settings. The results suggest that policy makers need to account for cultural or geographical differences when designing entrepreneurial educational and support programs in order to enhance the establishment of new businesses in rural and urban areas.
18

Kvantilové křivky / Quantile curves

Michl, Marek January 2017
Modeling of quantile curves is a common problem across various fields in today's practice. The topic of this thesis is estimating quantile curves in the case of two-sample gradual change, that is, when the relationship between two continuous variables in two samples is of interest and the relationship is the same for both samples up to a certain value of the explanatory variable. From that point on, the relationship can differ. The result of this thesis is a procedure for estimating quantile curves that fulfill this concept.
19

Nonparametric Statistical Inference for Entropy-type Functionals / Icke-parametrisk statistisk inferens för entropirelaterade funktionaler

Källberg, David January 2013
In this thesis, we study statistical inference for entropy, divergence, and related functionals of one or two probability distributions. Asymptotic properties of particular nonparametric estimators of such functionals are investigated. We consider estimation from both independent and dependent observations. The thesis consists of an introductory survey of the subject and some related theory and four papers (A-D). In Paper A, we consider a general class of entropy-type functionals which includes, for example, integer order Rényi entropy and certain Bregman divergences. We propose U-statistic estimators of these functionals based on the coincident or epsilon-close vector observations in the corresponding independent and identically distributed samples. We prove some asymptotic properties of the estimators such as consistency and asymptotic normality. Applications of the obtained results related to entropy maximizing distributions, stochastic databases, and image matching are discussed. In Paper B, we provide some important generalizations of the results for continuous distributions in Paper A. The consistency of the estimators is obtained under weaker density assumptions. Moreover, we introduce a class of functionals of quadratic order, including both entropy and divergence, and prove normal limit results for the corresponding estimators which are valid even for densities of low smoothness. The asymptotic properties of a divergence-based two-sample test are also derived. In Paper C, we consider estimation of the quadratic Rényi entropy and some related functionals for the marginal distribution of a stationary m-dependent sequence. We investigate asymptotic properties of the U-statistic estimators for these functionals introduced in Papers A and B when they are based on a sample from such a sequence. We prove consistency, asymptotic normality, and Poisson convergence under mild assumptions for the stationary m-dependent sequence. 
Applications of the results to time-series databases and entropy-based testing for dependent samples are discussed. In Paper D, we further develop the approach for estimation of quadratic functionals with m-dependent observations introduced in Paper C. We consider quadratic functionals for one or two distributions. The consistency and rate of convergence of the corresponding U-statistic estimators are obtained under weak conditions on the stationary m-dependent sequences. Additionally, we propose estimators based on incomplete U-statistics and show their consistency properties under more general assumptions.
20

Automatic State Construction using Decision Trees for Reinforcement Learning Agents

Au, Manix January 2005
Reinforcement Learning (RL) is a learning framework in which an agent learns a policy from continual interaction with the environment. A policy is a mapping from states to actions. The agent receives rewards as feedback on the actions performed. The objective of RL is to design autonomous agents that search for the policy that maximizes the expectation of the cumulative reward. When the environment is partially observable, the agent cannot determine the states with certainty. These states are called hidden in the literature. An agent that relies exclusively on the current observations will not always find the optimal policy. For example, a mobile robot needs to remember the number of doors it has passed in order to reach a specific door down a corridor of identical doors. To overcome the problem of partial observability, an agent uses both current and past (memory) observations to construct an internal state representation, which is treated as an abstraction of the environment. This research focuses on how features of past events are extracted with variable granularity for the internal state construction. The project introduces a new method that applies information theory and decision-tree techniques to derive a tree structure, which represents the state and the policy. The relevance of a candidate feature is assessed by the Information Gain Ratio ranking with respect to the cumulative expected reward. Experiments carried out on three different RL tasks have shown that our variant of the U-Tree (McCallum, 1995) produces a more robust state representation and faster learning. This better performance can be explained by the fact that the Information Gain Ratio exhibits a lower variance in return prediction than the Kolmogorov-Smirnov statistical test used in the original U-Tree algorithm.
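As a rough illustration of the Information Gain Ratio mentioned in this abstract, the sketch below computes the standard decision-tree gain ratio (information gain divided by split information) for a discrete feature against discrete labels. The thesis assesses candidate features with respect to cumulative expected reward; how returns are discretized there is not stated here, so the label encoding, function names, and toy data are assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def information_gain_ratio(feature_values, labels):
    """Gain ratio of a discrete feature with respect to discrete labels.

    Illustrative helper: in an RL setting the labels might be binned
    returns, which is an assumption, not the thesis's stated procedure.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)

    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional

    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical data: whether a door was seen one step ago vs. binned return.
    door_seen = ["yes", "yes", "no", "no"]
    binned_return = ["high", "high", "low", "low"]
    print(information_gain_ratio(door_seen, binned_return))
```

Candidate features would then be ranked by this ratio, with the highest-ranked feature chosen to split a tree node.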
