Global ETD Search

51	Structural equation models applied to quantitative genetics / Modelos de equações estruturais aplicados à genética quantitativa Cerqueira, Pedro Henrique Ramos 03 September 2015 Causal models have been used in many areas of knowledge to understand the causal associations between variables. Over the past decades, the number of studies using these models has grown considerably, especially in biological systems, where learning the causal relationships among traits is essential for predicting the consequences of interventions in such systems. Graph analysis (GA) and structural equation modeling (SEM) are tools used to explore these associations. While GA allows searching for causal structures that express qualitatively how variables are causally connected, fitting an SEM with a known causal structure allows the magnitude of the causal effects to be inferred. An SEM can also be viewed as a set of multiple regression models in which response variables can be explanatory variables for others. In quantitative genetics, SEM is used to study the direct and indirect genetic effects associated with individuals, using information beyond the observed traits, such as kinship relations. These studies typically assume linear relationships among traits. In some scenarios, however, nonlinear relationships are observed, which makes that assumption unsuitable. To overcome this limitation, this work proposes a mixed-effects polynomial structural equation model, of second or higher degree, to model those nonlinear relationships. Two studies were developed: a simulation and an application to real data. The first study involved the simulation of 50 data sets with a fully recursive causal structure involving three traits, in which linear and nonlinear causal relations between them were allowed. The second study involved the analysis of traits of Holstein dairy cattle; the phenotypes considered were calving difficulty, gestation length and the proportion of perinatal death. We compare the multiple-trait mixed model with polynomial structural equation models of different polynomial degrees in order to assess the benefits of a polynomial SEM of second or higher degree. In some situations, the inappropriate assumption of linearity results in poor predictions of the direct, indirect and total genetic variances and covariances, either overestimating them, underestimating them, or even assigning opposite signs to the covariances. We therefore conclude that including a polynomial degree increases the expressive power of the SEM.
Bayesian inference; Gibbs sampler; Holstein dairy cattle; Linear mixed models; Multiple trait mixed models; Polynomial regression; Quantitative genetics; Structural equation models
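The fully recursive three-trait structure with linear and nonlinear causal relations, as described in the abstract of entry 51, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function names, the coefficient names (`b21`, `q21`, `b31`, `b32`), the choice of a single quadratic effect from trait 1 on trait 2, and the Gaussian residuals. The sketch also leaves out the genetic (random) effects and the Bayesian fitting that the thesis works with, so it is not the author's simulation design.

```python
import random


def propagate(e1, e2, e3, coef):
    """Push three residuals through a fully recursive causal structure.

    Assumed structure: y1 -> y2, y1 -> y3 and y2 -> y3, where the effect of
    y1 on y2 has a linear part (b21) and a quadratic part (q21).
    """
    y1 = e1
    y2 = coef["b21"] * y1 + coef["q21"] * y1 ** 2 + e2
    y3 = coef["b31"] * y1 + coef["b32"] * y2 + e3
    return y1, y2, y3


def simulate(n, coef, residual_sd=1.0, seed=0):
    """Simulate n records with independent Gaussian residuals (an assumption)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        residuals = [rng.gauss(0.0, residual_sd) for _ in range(3)]
        records.append(propagate(*residuals, coef))
    return records


# Hypothetical coefficients, chosen only for illustration.
example_coef = {"b21": 1.0, "q21": 0.5, "b31": 0.3, "b32": 0.4}
example_data = simulate(50, example_coef, seed=1)
```

Setting `q21` to zero reduces the sketch to the usual linear recursive structure, which is the case the abstract reports as giving poor estimates when the true relationship is nonlinear.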
52	Structural equation models applied to quantitative genetics / Modelos de equações estruturais aplicados à genética quantitativa Pedro Henrique Ramos Cerqueira 03 September 2015 Causal models have been used in many areas of knowledge to understand the causal associations between variables. Over the past decades, the number of studies using these models has grown considerably, especially in biological systems, where learning the causal relationships among traits is essential for predicting the consequences of interventions in such systems. Graph analysis (GA) and structural equation modeling (SEM) are tools used to explore these associations. While GA allows searching for causal structures that express qualitatively how variables are causally connected, fitting an SEM with a known causal structure allows the magnitude of the causal effects to be inferred. An SEM can also be viewed as a set of multiple regression models in which response variables can be explanatory variables for others. In quantitative genetics, SEM is used to study the direct and indirect genetic effects associated with individuals, using information beyond the observed traits, such as kinship relations. These studies typically assume linear relationships among traits. In some scenarios, however, nonlinear relationships are observed, which makes that assumption unsuitable. To overcome this limitation, this work proposes a mixed-effects polynomial structural equation model, of second or higher degree, to model those nonlinear relationships. Two studies were developed: a simulation and an application to real data. The first study involved the simulation of 50 data sets with a fully recursive causal structure involving three traits, in which linear and nonlinear causal relations between them were allowed. The second study involved the analysis of traits of Holstein dairy cattle; the phenotypes considered were calving difficulty, gestation length and the proportion of perinatal death. We compare the multiple-trait mixed model with polynomial structural equation models of different polynomial degrees in order to assess the benefits of a polynomial SEM of second or higher degree. In some situations, the inappropriate assumption of linearity results in poor predictions of the direct, indirect and total genetic variances and covariances, either overestimating them, underestimating them, or even assigning opposite signs to the covariances. We therefore conclude that including a polynomial degree increases the expressive power of the SEM.
Bayesian inference; Gibbs sampler; Holstein dairy cattle; Linear mixed models; Multiple trait mixed models; Polynomial regression; Quantitative genetics; Structural equation models
53	Estimating the load weight of freight trains using machine learning Kongpachith, Erik January 2023 Accurate estimation of the load weight of freight trains is crucial for ensuring safe, efficient and sustainable rail freight transport. Traditional methods for estimating load weight often suffer from limitations in accuracy and efficiency. In recent years, machine learning algorithms have gained significant attention and use cases within the railway industry due to their strong predictive capabilities for classification and regression tasks. This study presents a proof of concept in the form of a comparative analysis of five machine learning regression algorithms (Polynomial Regression, K-Nearest Neighbors, Regression Trees, Random Forest Regression and Support Vector Regression) for estimating the load weight of freight trains using simulation data. The study uses two comprehensive datasets derived from train simulations in GENSYS, a simulation software for modeling rail vehicles. The datasets encompass various driving condition factors such as train speed, track conditions and running gear configurations. The algorithms are trained on these datasets, and their performance is evaluated using the root mean squared error and R² metrics. Results from the experiments show that all five machine learning algorithms perform promisingly when estimating the load weight. Polynomial regression achieves the best result on both datasets when many features of the datasets are considered. Random forest regression achieves the best result on both datasets when a small number of features are considered. Furthermore, it is suggested that the approach of this study be examined on real-world data from operating freight trains to confirm the proof of concept in a real-world setting.
Railway freight Transport; Rail Vehicle Weighing; Y25 Bogie; Sdggmrss T3000eD; GENSYS; Machine Learning; Regression; Polynomial Regression; Regression Trees; Random Forest Regression; Support Vector Regression; Computer and Information Sciences
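The two evaluation metrics named in the abstract of entry 53, root mean squared error and R², can be written out in a few lines. This is a minimal sketch using their standard definitions; the function names and the example load weights are hypothetical and are not taken from the thesis or from GENSYS output.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical load weights (tonnes) and model predictions, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
example_rmse = rmse(observed, predicted)
example_r2 = r2_score(observed, predicted)
```

A lower RMSE and an R² closer to 1 indicate a better fit; a model that always predicts the mean of the observed values scores an R² of 0.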
