71 |
Estimating Multilevel Structural Equation Models with Random Slopes for Latent Covariates
Rockwood, Nicholas John 03 July 2019
No description available.
|
72 |
Multi-level Latent Variable Models for Integrating Multiple Phenotypes for Mental Disorders
Zhao, Yinjun January 2024
The overarching goal of this dissertation is to integrate heterogeneous data for the estimation of disease coheritability and subtyping.
Chapter 2 focuses on the significance and estimation of heritability and coheritability, which quantify the proportion of phenotypic variation attributable to genetic factors and the genetic correlations between different traits, respectively. To achieve this, we develop robust statistical methods based on estimating equations that account for familial correlations and the computational challenges posed by large pedigrees and extensive datasets. Our methods are evaluated through simulations, demonstrating satisfactory consistency and robust inference properties. Compared to simpler methods that analyze each trait separately, our approaches show greater power through joint analysis of multiple traits. An application to the analysis of heritability and coheritability in electronic health record (EHR) data reveals substantial genetic correlations between mental disorders and metabolic/endocrine measurements, suggesting shared genetic influences that warrant further investigation. These findings have implications for understanding the etiology, diagnosis, and treatment of these conditions.
Chapters 3 and 4 focus on the importance of patient subtyping for personalized mental health care, which is particularly relevant given the substantial variability observed in mental disorders. Chapter 3 develops methods for subtyping patients with mental disorders using various data modalities and variational inference. We propose latent mixture models inspired by Item Response Theory to handle both binary and continuous data. We also introduce Black Box Variational Inference (BBVI) algorithms to overcome the challenges of numerical integration in nonlinear models. Our numerical experiments validate the proposed methods, demonstrating that variance-controlling techniques improve convergence speed and reduce iteration variance. However, the proposed algorithm encounters limitations with latent mixture models containing binary modalities, due to the approximations used for the non-conjugate posterior distributions that result from the non-exponential-family likelihood function.
Chapter 4 investigates multi-modal integration techniques for subtyping patients using data from the Adolescent Brain Cognitive Development (ABCD) study. We introduce a Bayesian hierarchical joint model with latent variables and utilize Pólya-Gamma augmentation for posterior approximation, which enables efficient Gibbs sampling and accurate estimation of model parameters. Extensive simulations confirm the consistency of estimators and the prediction accuracy of our method. Applying these methods to patient clustering in the ABCD study provides information for identifying potential clinical subtypes within mental health, which can inform the development of targeted psychological and educational interventions, ultimately improving mental health outcomes.
Keywords: latent mixture model, integrative analysis, coheritability, multi-modality, disease subtyping, variational inference, Pólya-Gamma
|
73 |
Dichotomous-Data Reliability Models with Auxiliary Measurements
俞一唐, Yu, I-Tang Unknown Date
We propose a new reliability model, DwACM (Dichotomous-data with Auxiliary Continuous Measurements model), to describe a data set consisting of a classical dichotomous response (Go or No Go) together with a set of continuous auxiliary measurements. In this model, the lifetime of each individual is treated as a latent variable. Given the value of the latent variable, the dichotomous response is either 0 or 1, depending on whether the individual has failed by the measuring time. The continuous measurements can be regarded as observations of an underlying candidate degradation quantity whose decay process is a function of the lifetime. Under the assumption that a product fails at the time the continuous measurement reaches a threshold, the two kinds of measurement can be linked in the proposed model. Statistical inference under this model is carried out in both frequentist and Bayesian frameworks. To evaluate the continuous measurements, we provide a model selection criterion, CCP (correct classification probability), and use DwACM together with CCP to select the best degradation measurement. We also report simulation studies of the performance of the parameter estimators and of CCP.
|
74 |
Les généralisations des récursivités de Kalman et leurs applications / Kalman recursion generalizations and their applications
Kadhim, Sadeq 20 April 2018
We consider state space models in which the observations are multicategorical and longitudinal, and the state is described by CHARN models. We estimate the state by generalized Kalman recursions, which rely on a variety of particle filters and the EM algorithm. Our results are applied to estimating the latent trait in quality of life, which provides an alternative to, and a generalization of, the methods existing in the literature. These results are illustrated by numerical simulations and an application to real data on the quality of life of women who underwent surgery for breast cancer.
|
75 |
Latent variable language models
Tan, Shawn 08 1900
No description available.
|
76 |
Essays on travel mode choice modeling : a discrete choice approach of the interactions between economic and behavioral theories / Essais sur la modélisation du choix modal : une approche par les choix discrets des interactions entre théories économiques et comportementales
Bouscasse, Hélène 09 November 2017
The objective of this thesis is to incorporate elements of psychology and behavioral economics theories into discrete choice models in order to improve the understanding of mode choice at the regional level. The estimations are based on a choice experiment survey presented in Part I. Part II examines the inclusion of latent variables to explain mode choice. A literature review of integrated choice and latent variable models (that is, models combining a structural equation model and a discrete choice model) is followed by the estimation of such a model to show how the heterogeneity of economic outputs (here, value of time) can be explained with latent variables (here, perceived comfort in public transport) and observable variables (here, the guarantee of a seat). The simulation of scenarios shows, however, that the economic gain (decrease in value of time) is higher when policies address tangible factors than when they address latent factors. On the basis of a mediation model, the estimation of a structural equation model furthermore shows that the influence of environmental concern on mode choice habits is partially mediated by the indirect utility derived from public transport use.
Part III examines two utility formulations taken from behavioral economics: 1) rank-dependent utility to model risky choices, and 2) reference-dependent utility. Firstly, a rank-dependent utility model is included in discrete choice models and, in particular, a latent-class model, in order to analyze intra- and inter-individual heterogeneity when the travel time is unreliable. The results show that the probability of a delay is over-estimated for train travel and under-estimated for car travel, especially by car users, as train users take greater account of the expected travel time. In the models that account for risk aversion, the utility functions are convex, which implies a decreasing value of time. Secondly, a new family of discrete choice models generalizing the multinomial logit model, the reference models, is estimated. On my data, these models allow for a better selection of explanatory variables than the multinomial logit model and a more robust estimation of economic outputs, particularly in cases of high unobserved heterogeneity. The economic interpretation of reference models shows that the best empirical models are also the most compatible with Tversky and Kahneman's reference-dependent model.
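For readers unfamiliar with the multinomial logit model that the reference models generalize, the following minimal sketch may help. It computes logit choice probabilities for two modes and a value of time as the ratio of the time and cost coefficients. The coefficient values, mode attributes, and function names are hypothetical and chosen only for illustration; they are not estimates or code from the thesis.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j)."""
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}

# Hypothetical coefficients (illustrative only, not taken from the thesis).
BETA_TIME = -0.04  # utility per minute of travel time
BETA_COST = -0.20  # utility per euro of travel cost

def systematic_utility(time_min, cost_eur):
    """Linear-in-parameters utility of one travel alternative."""
    return BETA_TIME * time_min + BETA_COST * cost_eur

# Hypothetical alternatives: (travel time in minutes, cost in euros).
modes = {"train": (50, 8.0), "car": (40, 12.0)}
utilities = {mode: systematic_utility(t, c) for mode, (t, c) in modes.items()}
probs = mnl_probabilities(utilities)

# Value of time: marginal rate of substitution between time and cost,
# converted from euros per minute to euros per hour.
value_of_time_per_hour = 60 * BETA_TIME / BETA_COST
```

With a linear utility the value of time is constant; the nonlinear (convex) utilities and latent variables discussed in the abstract are what make it vary across travelers and situations.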
|
77 |
中國股市與美國股市之共移性 / Co-movements between Chinese and American Stock Markets
張瑀宸 Unknown Date
This paper investigates stock market co-movements between the U.S. and China. We construct daytime and overnight returns for a portfolio of Chinese stocks, using their NYSE-traded ADRs, and for an industry-matched portfolio of American stocks between 2005 and 2010. The results show that the Chinese stock market is linked to the American stock market through different sources and magnitudes of shocks. The analysis, based on a two-stage latent variable regression, further indicates that the market correlations between China and the U.S. mostly come from competitive shocks. However, competitive shocks from the Yuan/Dollar foreign exchange rate and Treasury bill returns have lagged effects on the markets. The classification of shocks into competitive and global ones provides finer information for international risk diversification.
|
78 |
[en] POROSITY ESTIMATION FROM SEISMIC ATTRIBUTES WITH SIMULTANEOUS CLASSIFICATION OF SPATIALLY STRUCTURED LATENT FACIES / [pt] PREDIÇÃO DE POROSIDADE A PARTIR DE ATRIBUTOS SÍSMICOS COM CLASSIFICAÇÃO SIMULTÂNEA DE FACIES GEOLÓGICAS LATENTES EM ESTRUTURAS ESPACIAIS
LUIZ ALBERTO BARBOSA DE LIMA 26 April 2018
Estimating porosity in oil and gas reservoirs is a crucial and challenging task in the oil industry. This work proposes a novel nonlinear model for porosity estimation that treats sedimentary facies as latent variables. The model, called Transductive Conditional Random Field Regression (TCRFR), combines the concepts of conditional random fields (CRFs), transductive learning, and ridge regression. It uses seismic impedance volumes as input, conditioned on the porosity values from the available wells in the reservoir, and simultaneously and automatically outputs the porosity estimate and the facies classification for the whole volume. The method infers the latent facies states by combining the local, labeled, and accurate porosity information available at well locations with the plentiful but noisy impedance information available everywhere in the reservoir volume. The accurate information is propagated through the reservoir by probabilistic graphical models based on conditional random fields, greatly reducing uncertainty. In addition, two new techniques are introduced as preprocessing steps for applying TCRFR in the extreme but realistic cases where only a few porosity-labeled samples are available from a small set of exploratory wells, a typical situation for geologists evaluating a reservoir in the exploration phase. Experiments on both synthetic and real-world data show that the method outperforms previous automatic estimation methods on synthetic data and gives results comparable to the traditional, labor-intensive manual geostatistics approach on real-world data.
|
79 |
[en] NONLINEAR MODELS IN ASSESSMENT IN THE SOCIAL SCIENCES: ESTIMATION BY STOCHASTIC APPROXIMATION, A FREQUENTIST MCMC / [pt] MODELOS NÃO LINEARES EM AVALIAÇÃO NAS CIÊNCIAS SOCIAIS: ESTIMAÇÃO POR APROXIMAÇÃO ESTOCÁSTICA UMA MCMC FREQÜENTISTA
CARLOS ALBERTO QUADROS COIMBRA 19 July 2005
This work presents a study of statistical models used for assessment and measurement in the social sciences. The main contributions are: i) a unified description of how evaluation, assessment, and the theory of measurement evolved within several branches of science; ii) a review of estimation methods currently employed in nonlinear models; iii) a general formulation of the maximum likelihood estimation method; and, in particular, iv) the presentation of the stochastic approximation method for the estimation of nonlinear statistical models in measurement and assessment.
Nonlinear models occur frequently in the social sciences, where it is important to model binary or ordinal response variables. This work deals with item response theory models, logistic regression models, and general models with random components. The estimation of these models is still the subject of intense research, and one cannot say that there is a single best estimation method. Approximate methods are known to produce strongly biased estimates of the variance components, while numerical integration methods and Bayesian methods can present convergence problems in many cases. The stochastic approximation method is a maximum likelihood method that uses the Robbins-Monro algorithm to solve the score equation. As a stochastic method, it generates a Markov chain that converges to the desired estimates and can therefore be considered a frequentist MCMC (Markov chain Monte Carlo). A simulation study and a comparative estimation study show good performance: the method produces estimates with small bias and good precision, with very rare convergence problems.
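The Robbins-Monro recursion mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the estimator developed in the thesis: it assumes a model with no latent variables (the mean of a unit-variance normal distribution), so the noisy score is simply one observation minus the current estimate, whereas in latent variable models the score would itself be approximated by simulation. The function name `robbins_monro` and the gain sequence 1/n are illustrative choices.

```python
import random

def robbins_monro(noisy_score, theta0, n_iter, gain=lambda n: 1.0 / n):
    """Iterate theta_{n+1} = theta_n + a_n * H_n, where H_n is a noisy
    evaluation of the score at theta_n and a_n is a decreasing gain."""
    theta = theta0
    for n in range(1, n_iter + 1):
        theta += gain(n) * noisy_score(theta)
    return theta

# Toy problem (an assumption for illustration): the mean of a unit-variance
# normal distribution. The per-observation score is (x - theta), and its
# expectation is zero at the maximum likelihood estimate.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(5000)]
stream = iter(data)

theta_hat = robbins_monro(
    lambda theta: next(stream) - theta,  # one fresh observation per step
    theta0=0.0,
    n_iter=len(data),
)
sample_mean = sum(data) / len(data)
```

With the gain 1/n, the recursion reproduces the running sample mean, which is the maximum likelihood estimate in this toy case; other decreasing gain sequences trade off convergence speed against variance.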
|
80 |
Who Spoke What And Where? A Latent Variable Framework For Acoustic Scene Analysis
Sundar, Harshavardhan 26 March 2016
Speech is by far the most natural form of communication between human beings. It is intuitive and expressive, and it contains information at several cognitive levels. As humans, we perceive several of these levels: in addition to the content of what is being spoken, we can gather information about the speaker's identity, gender, emotion, and location, the language, and so on. This makes speech-based human-machine interaction (HMI) both desirable and challenging, for the same set of reasons. For HMI to be natural for humans, it is imperative that a machine understand the information present in speech, at least at the level of speaker identity, language, location in space, and a summary of what is being spoken.
Although one can draw parallels between human-human interaction and HMI, the two differ in purpose. As humans, we interact with a machine mostly to get a task done more efficiently than is possible without it. Thus, in HMI, controlling the machine in a specific manner is typically the primary goal. In this context, it can be argued that HMI with a limited vocabulary of specific commands would suffice for more efficient use of the machine.
In this thesis, we address the problem of "Who spoke what and where" in the context of a machine understanding the identities of the speakers, their locations in space, and the keywords they spoke, thus considering three levels of information: speaker identity (who), location (where), and keywords (what). This problem can be addressed by combining multiple sensing modalities such as microphones, video cameras, proximity sensors, and motion detectors. However, we explore the use of microphones alone. In practical scenarios, several people often talk at the same time. The goal of this thesis is therefore to detect all the speakers, their keywords, and their locations in mixture signals containing speech from simultaneous speakers. Addressing "Who spoke what and where" using only microphone signals forms a part of acoustic scene analysis (ASA) of speech-based acoustic events.
We divide the problem of "Who spoke what and where" into two sub-problems: "Who spoke what?" and "Who spoke where?". Each of these is cast in a generic latent variable (LV) framework to capture information in speech at different levels. We associate an LV with each of these levels and model the relationship between the levels using conditional dependency.
The sub-problem of "Who spoke what?" is addressed using a single-channel microphone signal, by modeling the mixture signal in terms of the LV mass function of speaker identity, the conditional mass function of the keyword spoken given the speaker identity, and a speaker-specific keyword model. The LV mass functions are estimated in a maximum likelihood (ML) framework using the Expectation Maximization (EM) algorithm, with Student's-t mixture models (tMM) as the speaker-specific keyword models. Motivated by HMI in a home environment, we have created our own database. In mixture signals containing two speakers uttering keywords simultaneously, the proposed framework achieves an accuracy of 82 % for detecting both the speakers and their respective keywords.
The other sub-problem, "Who spoke where?", is addressed in two stages. In the first stage, the enclosure is discretized into sectors. The speakers and the sectors in which they are located are detected with an approach similar to the one employed for "Who spoke what?", using signals collected from a Uniform Circular Array (UCA). However, in place of speaker-specific keyword models, we use tMM-based speaker models trained on clean speech, along with a simple Delay and Sum Beamformer (DSB). In the second stage, the speakers are localized within the active sectors using a novel region-constrained localization technique based on time difference of arrival (TDOA). Since the problem being addressed is a multi-label classification task, we use the average Hamming score (accuracy) as the performance metric. Although the proposed approach yields an accuracy of 100 % in an anechoic setting for detecting both the speakers and their corresponding sectors in two-speaker mixture signals, the accuracy degrades to 67 % in a reverberant setting with a 60 dB reverberation time (RT60) of 300 ms. To improve the performance under reverberation, prior knowledge of the locations of multiple sources is obtained using a novel technique derived from geometrical insights into TDOA estimation. With this prior knowledge, the accuracy of the proposed approach improves to 91 %. It is worth noting that the accuracies are computed for mixture signals containing more than 90 % overlap between competing speakers.
The proposed LV framework offers a convenient methodology for representing information at broad levels. In this thesis, we have shown its use with three different levels. It can be extended to several such levels for a generic analysis of an acoustic scene consisting of broad levels of events. Not all levels depend on each other, so the LV dependencies can be reduced through independence assumptions, which leads to solving several smaller sub-problems, as shown above. The LV framework is also attractive for incorporating prior knowledge about the acoustic setting, which is combined with the evidence from the data to infer the presence of an acoustic event. The performance of the framework depends on the choice of stochastic models for the likelihood function of the data given the presence of acoustic events. However, the framework provides a way to compare and contrast different stochastic models for representing the likelihood function.
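The abstract above reports accuracies as an average Hamming score for a multi-label task but does not give the formula. The sketch below uses one common definition, the per-sample ratio of correctly predicted labels to the union of true and predicted labels, averaged over samples. This definition, the function name `hamming_score`, and the example labels are assumptions for illustration and may differ from what the thesis computes.

```python
def hamming_score(true_labels, predicted_labels):
    """Average over samples of |true & predicted| / |true | predicted|."""
    scores = []
    for truth, pred in zip(true_labels, predicted_labels):
        truth, pred = set(truth), set(pred)
        union = truth | pred
        # Convention assumed here: two empty label sets count as a perfect match.
        scores.append(1.0 if not union else len(truth & pred) / len(union))
    return sum(scores) / len(scores)

# Hypothetical two-speaker mixtures; each label is a (speaker, sector) pair.
truth = [{("spk1", 3), ("spk2", 7)}, {("spk1", 1), ("spk3", 5)}]
pred = [{("spk1", 3), ("spk2", 7)}, {("spk1", 1), ("spk2", 5)}]

# First mixture: both pairs correct (score 1). Second mixture: one of three
# distinct pairs is shared (score 1/3). The average is 2/3.
score = hamming_score(truth, pred)
```

Unlike exact-match accuracy, this score gives partial credit when only one of the two simultaneous speakers is detected correctly.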
|