111 |
Markov exchangeable data and mixtures of Markov Chains. Di Cecco, Davide <1980> 19 March 2009 (has links)
No description available.
|
112 |
Multiple testing in spatial epidemiology: a Bayesian approach. Ventrucci, Massimo <1980> 19 March 2009 (has links)
In this work we propose a new approach for preliminary epidemiological studies on Standardized Mortality Ratios (SMRs) collected over many spatial regions. A preliminary study on SMRs aims to formulate hypotheses to be investigated via individual epidemiological studies, which avoid the bias carried by aggregated analyses. Starting from the collected disease counts and the expected disease counts computed from reference-population disease rates, in each area an SMR is derived as the MLE under the Poisson assumption on each observation. Such estimators have high standard errors in small areas, i.e. where the expected count is low either because of the small population underlying the area or the rarity of the disease under study. Disease mapping models and other techniques for screening disease rates across the map, aiming to detect anomalies and possible high-risk areas, have been proposed in the literature under both the classical and the Bayesian paradigm. Our proposal approaches this issue with a decision-oriented method focused on multiple testing control, without abandoning the preliminary-study perspective that an analysis of SMR indicators is expected to serve. We implement control of the FDR, a quantity widely used to address multiple comparison problems in the field of microarray data analysis but not usually employed in disease mapping. Controlling the FDR means providing an estimate of the FDR for a set of rejected null hypotheses.
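As a minimal numerical illustration (the counts below are made up, not data from the study), the SMR and its approximate standard error in each area can be computed as:

```python
import numpy as np

# Hypothetical observed and expected disease counts for a handful of areas.
observed = np.array([4, 0, 12, 7])
expected = np.array([2.5, 1.1, 10.8, 6.9])  # from reference-population rates

# Under y_i ~ Poisson(E_i * theta_i), the MLE of the relative risk theta_i
# is the Standardized Mortality Ratio: SMR_i = y_i / E_i.
smr = observed / expected

# The standard error grows as the expected count shrinks, which is why
# SMR estimates in small areas are so unstable.
se = np.sqrt(observed) / expected
print(np.column_stack([smr, se]))
```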
The small-area issue raises difficulties in applying traditional methods for FDR estimation, which are usually based only on knowledge of the p-values (Benjamini and Hochberg, 1995; Storey, 2003). Tests evaluated by a traditional p-value have low power in small areas, where the expected number of disease cases is small. Moreover, the tests cannot be assumed independent when spatial correlation between SMRs is expected, nor identically distributed when the population underlying the map is heterogeneous.
The Bayesian paradigm offers a way to overcome the inappropriateness of p-value based methods. Another peculiarity of the present work is to propose a fully hierarchical Bayesian model for FDR estimation when testing many null hypotheses of absence of risk. We use concepts from Bayesian disease mapping models, referring in particular to the Besag, York and Mollié (1991) model, often used in practice for its flexible prior assumption on the distribution of risks across regions. The borrowing of strength between prior and likelihood typical of a hierarchical Bayesian model has the advantage of evaluating a single test (i.e. the test in a single area) by means of all the observations in the map under study, rather than just the single observation. This improves the power of the test in small areas and addresses more appropriately the spatial correlation issue, which suggests that relative risks are closer in spatially contiguous regions.
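For reference, a standard form of the Besag, York and Mollié specification mentioned above (the notation here is generic, not taken from the thesis) is:

```latex
\begin{aligned}
y_i \mid \theta_i &\sim \mathrm{Poisson}(E_i\,\theta_i), \\
\log \theta_i &= \alpha + u_i + v_i, \\
v_i &\sim \mathrm{N}(0, \sigma_v^2), \\
u_i \mid u_{j \neq i} &\sim \mathrm{N}\!\left(\frac{1}{n_i}\sum_{j \sim i} u_j,\; \frac{\sigma_u^2}{n_i}\right),
\end{aligned}
```

where j ~ i ranges over the neighbours of area i and n_i is their number; the intrinsic CAR prior on the u_i is what pulls the risks of contiguous regions toward each other.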
The proposed model estimates the FDR by means of the MCMC-estimated posterior probabilities b_i of the null hypothesis (absence of risk) in each area. An estimate of the expected FDR conditional on the data (denoted FDR-hat) can be computed for any set of areas declared at high risk (where the null hypothesis is rejected) by averaging the corresponding b_i's. FDR-hat can be used to provide an easy decision rule for selecting high-risk areas, i.e. selecting as many areas as possible such that FDR-hat does not exceed a prefixed value; we call these FDR-hat based decision (or selection) rules. The sensitivity and specificity of such a rule depend on the accuracy of the FDR estimate: over-estimation of the FDR causes a loss of power, while under-estimation produces a loss of specificity. Moreover, our model retains the interesting feature of providing an estimate of the relative risk values, as in the Besag, York and Mollié (1991) model.
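A minimal sketch of such an FDR-hat based selection rule, assuming the posterior null probabilities b_i have already been obtained from the MCMC output (function and variable names are illustrative):

```python
import numpy as np

def fdr_selection(b, threshold=0.05):
    """FDR-hat based selection rule (minimal sketch).

    b: posterior probabilities of the null (absence of risk), one per area.
    Returns the indices of the areas declared at high risk."""
    b = np.asarray(b, dtype=float)
    order = np.argsort(b)                                   # most convincing rejections first
    running_fdr = np.cumsum(b[order]) / np.arange(1, len(b) + 1)  # FDR-hat of each nested set
    n_selected = int(np.sum(running_fdr <= threshold))      # largest set with FDR-hat <= threshold
    return order[:n_selected]
```

Since the b_i are scanned in ascending order, the running mean never decreases, so the function returns the largest set of areas whose estimated FDR stays below the chosen threshold.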
A simulation study was set up to evaluate the model's performance in terms of FDR estimation accuracy, sensitivity and specificity of the decision rule, and goodness of estimation of the relative risks. We chose a real map from which we generated several spatial scenarios whose disease counts vary according to the degree of spatial correlation, the size of the areas, the number of areas where the null hypothesis is true, and the risk level in the latter areas. In summarizing the simulation results we always consider FDR estimation over the sets formed by all areas whose b_i falls below a threshold t. We show graphs of FDR-hat and the true FDR (known by simulation) plotted against the threshold t to assess the FDR estimation. By varying the threshold we can learn which FDR values can be accurately estimated by a practitioner applying the model (from the closeness between FDR-hat and the true FDR). By plotting the calculated sensitivity and specificity (both known by simulation) against FDR-hat we can check the sensitivity and specificity of the corresponding FDR-hat based decision rules. To investigate the degree of over-smoothing of the relative risk estimates we compare box-plots of such estimates in high-risk areas (known by simulation), obtained with both our model and the classic Besag, York and Mollié model. All these summary tools are worked out for all simulated scenarios (54 scenarios in total).
Results show that the FDR is well estimated (in the worst case we get an over-estimation, hence a conservative FDR control) in the scenarios with small areas, low risk levels and spatially correlated risks, which are our primary target. In such scenarios we obtain good estimates of the FDR for all values less than or equal to 0.10. The sensitivity of FDR-hat based decision rules is generally low but specificity is high, so in these scenarios selection rules based on FDR-hat = 0.05 or FDR-hat = 0.10 can be recommended. In cases where the number of true alternative hypotheses (the number of true high-risk areas) is small, FDR values up to 0.15 are also well estimated, and an FDR-hat = 0.15 based decision rule gains power while maintaining high specificity. On the other hand, in scenarios with non-small areas and non-small risk levels the FDR is under-estimated except for very small values (much lower than 0.05); this results in a loss of specificity for an FDR-hat = 0.05 based decision rule. In such scenarios decision rules based on FDR-hat = 0.05 or, even worse, FDR-hat = 0.1 cannot be recommended, because the true FDR is actually much higher. As regards relative risk estimation, our model achieves almost the same results as the classic Besag, York and Mollié model. For this reason, our model is interesting for its ability to perform both the estimation of relative risk values and FDR control, except in scenarios with non-small areas and large risk levels. A case study is finally presented to show how the method can be used in epidemiology.
|
113 |
A Bayesian model for mixed responses and skew latent variable to measure HRQoL in children. Broccoli, Serena <1982> 30 March 2010 (has links)
The aim of the thesis is to formulate a suitable Item Response Theory (IRT) based model to measure HRQoL (as a latent variable) using a mixed-response questionnaire and relaxing the hypothesis of a normally distributed latent variable. The new model is a combination of two models already presented in the literature: a latent trait model for mixed responses and an IRT model for a skew-normal latent variable. It is developed in a Bayesian framework; a Markov chain Monte Carlo procedure is used to generate samples from the posterior distribution of the parameters of interest. The proposed model is tested on the EQ-5D-Y questionnaire, composed of five discrete items and one continuous item, used to measure HRQoL in children. A large sample of children recruited in schools was used. In comparison with a model for discrete responses only and a model for mixed responses with a normal latent variable, the new model performs better in terms of deviance information criterion (DIC), chain convergence times and precision of the estimates.
|
114 |
Adaptive Markov Chain Monte Carlo: a new mixture based algorithm with applications to Bayesian Modeling. Di Narzo, Antonio <1982> 30 March 2010 (has links)
No description available.
|
115 |
Latent variable statistical methods for the study of financial phenomena. De Angelis, Luca <1978> 30 March 2010 (has links)
Over the last few decades the concept of latent variable has met with enormous success in the statistical disciplines, as witnessed by the numerous scientific papers in the literature. In the social sciences and in psychometrics in particular, the latent variable concept has been widely adopted to tackle the problem of measuring quantities that cannot, by their nature, be directly observed. The vast literature on this methodology extends, to a more limited degree, to the field of economic and econometric research. Although studies exist that apply latent structure models to economic variables, very few works consider financial variables and, so far, practically no researcher has connected standard portfolio theory with the methodology of latent variable statistical models. The objective of this work is to exploit the explanatory and investigative potential of latent variable statistical methods for the analysis of financial phenomena. We refer, in particular, to latent class models, which allow methodologically sound solutions to be developed for important problems that are still open in finance. First, the very nature of financial variables fits the latent variable paradigm: variables such as risk and expected return cannot be measured directly and require approximations to assess their magnitude. Neglecting the unobservable nature of financial variables, however, can lead to inappropriate or even disastrous investment decisions. Second, the capabilities of latent class models are considered in the context of classification. For financial products, a correct classification based on the (latent) risk and return profile is the indispensable prerequisite for developing effective investment strategies. We also aim to develop a connection, so far missing, between one of the cornerstones of modern finance, Markowitz's classical portfolio theory, and the statistical methodology of latent variable models. In this context we investigate, in particular, the benefits that latent variable models can bring to the optimization of the risk and expected return profile of a portfolio of financial assets. The development of price index numbers for financial assets with a solid methodological basis is a further area in which latent class models can play a fundamentally important role; in particular, we propose to analyse sectoral price index numbers, which are one of the most important references in risk diversification strategies. Finally, the move from a static specification to a dynamic analysis touches on frontier methodological issues that can be investigated within latent class Markov models. The latent risk and return profile can thus be studied across the different phases of the financial markets, for which the transition probabilities allow predictive assessments of great interest.
|
116 |
Combination of gene expression studies. An overview, critical aspects and a proposal of application. Freni Sterrantino, Anna <1978> 30 March 2010 (has links)
No description available.
|
117 |
The effects of unemployment spells on the duration of job search. A study on European panel data. Mazzocchetti, Angelina <1976> 20 April 2004 (has links)
Over the last twenty years unemployment has risen across Europe, youth unemployment in particular: in 1997, in many European countries, the unemployment rate for the 15-24 age group was twice that of adults.
This work aims to give a realistic description of youth unemployment in Italy as it emerges from the first 4 waves of the European Community Household Panel survey, investigating the probability of transition from unemployment to employment. Within a framework based on the theory of stochastic processes and duration data, we investigate the effects that previous unemployment spells may have on the probability of finding a job; in particular, moving to more general stochastic processes, the semi-Markov assumption is relaxed in order to account for the effect of a function of the past history of the process on the current transition to employment. The estimate of the hazard function at various duration intervals proves more appropriate when the past history of the process is taken into account; in particular, the hypothesis is confirmed that the chance of succeeding in the job search is negatively affected by having experienced many unemployment-employment-unemployment transitions in the past.
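As an illustration only (not the specification estimated in the thesis; data and variable names are hypothetical), a discrete-time hazard for the unemployment-to-employment transition can include a function of the past history, such as the number of previous unemployment spells, as a covariate:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical person-period data: one row per individual-month at risk.
# event = 1 if the unemployment spell ends in employment that month.
rng = np.random.default_rng(0)
n = 500
duration = rng.integers(1, 24, size=n)      # months already spent in the current spell
past_spells = rng.poisson(1.0, size=n)      # previous unemployment-employment-unemployment cycles
event = rng.binomial(1, 0.3, size=n)

# Discrete-time hazard via a logit on duration and the past-history covariate.
X = sm.add_constant(np.column_stack([np.log(duration), past_spells]))
fit = sm.Logit(event, X).fit(disp=0)
print(fit.params)  # a negative coefficient on past_spells would support the thesis's hypothesis
```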
|
118 |
Correlation and disorder indicators based on the concept of entropy. Bertozzi, Fabio <1965> 18 March 2011 (has links)
The aim of this work is to carry out an applied, comparative and exhaustive study of several entropy-based indicators of independence and correlation.
We considered some indicators with a wide and consolidated literature, such as mutual information, joint entropy, and relative entropy or Kullback-Leibler distance, and others introduced more recently, such as the Granger, Maasoumi and Racine entropy, also called Sρ, or used in more restricted domains, such as Pincus's approximate entropy, or ApEn.
We studied the behaviour of these indicators by applying them to binary series. The series were designed to simulate a wide range of situations, in order to characterize the indicators' limits and capabilities and to identify, case by case, the most useful and trustworthy ones.
Our goal was not only to study whether these indicators can discriminate between dependence and independence, since, especially for mutual information and the Granger, Maasoumi and Racine entropy, this has already been demonstrated and reported in the literature, but also to verify whether and how they can provide information about the structure, complexity and disorder of the series they are applied to.
Special attention was paid to Pincus's approximate entropy, which its author claims can provide information about the level of randomness, regularity and complexity of a series.
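For concreteness, a minimal sketch of Pincus's approximate entropy for a single series (parameter values m and r are illustrative; for binary series any r < 1 amounts to exact matching):

```python
import numpy as np

def apen(u, m=2, r=0.5):
    """Approximate entropy ApEn(m, r) of a 1-D series (minimal sketch)."""
    u = np.asarray(u, dtype=float)
    N = len(u)

    def phi(m):
        # Embed the series into overlapping vectors of length m.
        x = np.array([u[i:i + m] for i in range(N - m + 1)])
        # C_i: fraction of vectors within tolerance r (Chebyshev distance), self-matches included.
        C = np.array([np.mean(np.max(np.abs(x - xi), axis=1) <= r) for xi in x])
        return np.mean(np.log(C))

    return phi(m) - phi(m + 1)

# Example on a random binary series: values near log(2) indicate high irregularity.
print(apen(np.random.default_rng(1).integers(0, 2, 200)))
```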
Through a focused and extensive investigation, we furthermore tried to clarify the meaning of ApEn when applied to a pair of different series; in this situation the indicator is known in the literature as cross-ApEn.
The meaning of cross-ApEn and the interpretation of its results are often neither simple nor unambiguous, and the matter is scarcely explored in the literature, so users can easily be led to misleading conclusions, especially if the indicator is employed, as unfortunately often happens, in an uncritical manner.
In order to fill some of the gaps and limits of cross-ApEn that clearly emerged during the experimentation, we developed, and applied to the cases already considered, a further indicator that we called the “correspondence index”. The correspondence index is fully integrated into the cross-ApEn computational algorithm and is able to provide, at least for binary data, accurate information about the intensity and direction of any correlation, even non-linear, between two different series, while at the same time detecting a possible condition of independence between the series themselves.
|
119 |
Inference on copula-based correlation structures. Foscolo, Enrico <1983> 18 March 2011 (has links)
We propose an extension of the approach of Klüppelberg and Kuhn (2009) for inference on second-order structure moments. As in Klüppelberg and Kuhn (2009), we adopt a copula-based approach instead of assuming a normal distribution for the variables, thus relaxing the equality-in-distribution assumption. A new copula-based estimator for structure moments is investigated. The methodology of Klüppelberg and Kuhn (2009) is also extended by considering the copulas associated with the family of Eyraud-Farlie-Gumbel-Morgenstern distribution functions (Kotz, Balakrishnan, and Johnson, 2000, Equation 44.73). Finally, a comprehensive simulation study and an application to real financial data are performed in order to compare the different approaches.
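For reference, the bivariate member of the Eyraud-Farlie-Gumbel-Morgenstern family (the thesis works with the general multivariate form, which is not reproduced here) is

```latex
C_\theta(u, v) = u v \,\bigl[\, 1 + \theta (1 - u)(1 - v) \,\bigr], \qquad \theta \in [-1, 1],
```

with density c_theta(u, v) = 1 + theta(1 - 2u)(1 - 2v) and Spearman's rho equal to theta/3, so the family can only capture moderate dependence.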
|
120 |
An evolutionary approach to the design of experiments for combinatorial optimization with an application to enzyme engineering. Borrotti, Matteo <1981> 18 March 2011 (has links)
In a large number of problems, the high dimensionality of the search space, the vast number of variables and the economic constraints limit the ability of classical techniques to reach the optimum of a function, known or unknown. In this thesis we investigate the possibility of combining approaches from advanced statistics with optimization algorithms so as to better explore the combinatorial search space and increase the performance of the approaches. To this purpose we propose two methods: (i) Model Based Ant Colony Design and (ii) Naïve Bayes Ant Colony Optimization. We test the performance of the two proposed solutions in a simulation study and apply the novel techniques to an application in the field of Enzyme Engineering and Design.
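As background only, a minimal sketch of a generic ant colony optimization loop over binary strings (this is neither Model Based Ant Colony Design nor Naïve Bayes Ant Colony Optimization, whose details are not given in the abstract; all names and parameters are illustrative):

```python
import numpy as np

def aco_binary(score, n_vars, n_ants=20, n_iter=100, rho=0.1, seed=0):
    """Generic ACO over binary strings: sample from pheromones, reinforce the best solution."""
    rng = np.random.default_rng(seed)
    pher = np.full((n_vars, 2), 0.5)          # pheromone for each position/value pair
    best_x, best_val = None, -np.inf
    for _ in range(n_iter):
        prob = pher / pher.sum(axis=1, keepdims=True)
        ants = (rng.random((n_ants, n_vars)) < prob[:, 1]).astype(int)  # sample candidate strings
        vals = np.array([score(x) for x in ants])
        i = int(np.argmax(vals))
        if vals[i] > best_val:
            best_x, best_val = ants[i].copy(), vals[i]
        pher *= (1 - rho)                       # evaporation
        for j, bit in enumerate(best_x):        # reinforce the best-so-far solution
            pher[j, bit] += rho
    return best_x, best_val

# Toy example: maximize the number of ones (a stand-in for an enzyme fitness score).
x, v = aco_binary(lambda s: s.sum(), n_vars=30)
```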
|