231

Modelling of Level Crossing Accident Risk

Sleep, Julie January 2008 (has links)
This thesis details the development of a model of driver behaviour at railway level crossings that allows the probability of an accident under different conditions and interventions to be calculated. A method for classifying different crossings according to their individual risk levels is also described.
232

Applications of Game Theory, Tableau, Analytics, and R to Fashion Design

Asiri, Aisha 08 August 2018 (has links)
This thesis applies models from game theory, analytics, and statistics to the fashion industry to predict product profits. To estimate the expected 2016 performance of each product, we used tools of game theory to compute expected values. We then performed simple linear regressions and used scatter plots to further predict the performance of Prada's products, and used the Tableau platform to visualize an overview of the products' performances. Together, these tools were used to produce better predictions of Prada's product performance.
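The regression step described in this abstract can be sketched as follows. This is a hedged illustration in Python with invented yearly profit figures, not Prada's actual data or the thesis's model.

```python
# Minimal sketch: simple linear regression to extrapolate a product's
# yearly profits to 2016, as the abstract describes. All numbers are
# invented for illustration.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

years = [2012, 2013, 2014, 2015]   # hypothetical observation years
profit = [3.2, 3.4, 3.3, 3.6]      # hypothetical profits (billions)

a, b = fit_line(years, profit)
forecast_2016 = a + b * 2016
print(round(forecast_2016, 2))  # extrapolated 2016 profit: 3.65
```

A scatter plot of `years` against `profit`, as mentioned in the abstract, would show how well this fitted line tracks the observed points.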
233

Modelos lineares mistos em dados longitudinais com o uso do pacote ASReml-R / Linear Mixed Models with longitudinal data using ASReml-R package

Renata Alcarde 10 April 2012 (has links)
Grande parte dos experimentos instalados atualmente é planejada para que sejam realizadas observações ao longo do tempo, ou em diferentes profundidades, enfim, tais experimentos geralmente contêm um fator longitudinal. Uma maneira de se analisar esse tipo de conjunto de dados é utilizando modelos mistos, por meio da inclusão de fatores de efeito aleatório e, fazendo uso do método da máxima verossimilhança restrita (REML), podem ser estimados os componentes de variância associados a tais fatores com um menor viés. O pacote estatístico ASReml-R, muito eficiente no ajuste de modelos lineares mistos por possuir uma grande variedade de estruturas para as matrizes de variâncias e covariâncias já implementadas, apresenta o inconveniente de não ter como objetos as matrizes de delineamento X e Z, nem as matrizes de variâncias e covariâncias D e Σ, sendo estas de grande importância para a verificação das pressuposições do modelo. Este trabalho reuniu ferramentas que facilitam e fornecem passos para a construção de modelos baseados na aleatorização, tais como o diagrama de Hasse, o diagrama de aleatorização e a construção de modelos mistos incluindo fatores longitudinais. Sendo o vetor de resíduos condicionais e o vetor de parâmetros de efeitos aleatórios confundidos, ou seja, não independentes, foram obtidos resíduos denominados na literatura resíduos com confundimento mínimo e, como proposta deste trabalho, foi calculado o EBLUP com confundimento mínimo. Para tanto, foram implementadas funções que, utilizando os objetos de um modelo ajustado com o uso do pacote estatístico ASReml-R, tornam disponíveis as matrizes de interesse e calculam os resíduos com confundimento mínimo e o EBLUP com confundimento mínimo.
Para elucidar as técnicas aqui apresentadas e salientar a importância da verificação das pressuposições do modelo adotado, foram considerados dois exemplos contendo fatores longitudinais, sendo o primeiro um experimento simples, visando a comparação da eficiência de diferentes coberturas em instalações avícolas, e o segundo um experimento realizado em três fases, contendo fatores inteiramente confundidos, com o objetivo de avaliar características do papel produzido por diferentes espécies de eucaliptos em diferentes idades. / Currently, most experiments are designed so that observations are carried out over time or at different depths; such experiments usually contain a longitudinal factor. One way to analyse this kind of data set is with mixed models, through the inclusion of random-effect factors; using the restricted maximum likelihood (REML) method, the variance components associated with such factors can be estimated with smaller bias. The ASReml-R statistical package, very efficient for fitting linear mixed models because a wide variety of structures for the variance-covariance matrices is already implemented, has the disadvantage of exposing neither the design matrices X and Z nor the variance-covariance matrices D and Σ as objects, although these are very important for checking the model assumptions. This work gathered tools that facilitate and provide steps for building models based on randomization, such as the Hasse diagram and the randomization diagram, and for formulating mixed models that include longitudinal factors. Since the conditional residuals and the random-effect parameters are confounded, i.e. not independent, residuals known in the literature as least-confounded residuals were obtained and, as a proposal of this work, the least-confounded EBLUP was calculated. To this end, functions were implemented that, using the objects of a model fitted with ASReml-R, make the matrices of interest available and compute the least-confounded residuals and the least-confounded EBLUP. To illustrate the techniques presented here and to highlight the importance of checking the assumptions of the adopted model, two examples containing longitudinal factors were considered: the first a simple experiment comparing the efficiency of different roofing types in poultry facilities, and the second an experiment carried out in three phases, containing completely confounded factors, aimed at evaluating characteristics of the paper produced by different eucalyptus species at different ages.
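As a rough illustration of the design matrices X and Z discussed above (which, per the abstract, ASReml-R does not expose as objects), the following Python sketch builds fixed-effect and random-effect incidence matrices for a toy longitudinal layout. The layout (2 treatments, 4 subjects, 2 time points) is invented for illustration and is not the thesis's data.

```python
# Hedged sketch of mixed-model design matrices for a toy longitudinal
# experiment: X encodes the fixed treatment factor, Z the random subject
# factor. Every record is (treatment, subject, time); all values invented.

records = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
    (1, 2, 0), (1, 2, 1), (1, 3, 0), (1, 3, 1),
]

n_trt, n_subj = 2, 4
# fixed-effect incidence: one column per treatment level
X = [[1 if trt == j else 0 for j in range(n_trt)] for trt, _, _ in records]
# random-effect incidence: one column per subject
Z = [[1 if subj == j else 0 for j in range(n_subj)] for _, subj, _ in records]

# each observation belongs to exactly one subject
assert all(sum(row) == 1 for row in Z)
print(len(X), len(X[0]), len(Z[0]))  # 8 rows, 2 fixed cols, 4 random cols
```

In the mixed model y = Xβ + Zu + e, these incidence matrices, together with the variance-covariance matrices D (of u) and Σ (of e), are exactly the objects the abstract says are needed for residual diagnostics.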
234

Mapeamento de QTLs utilizando as abordagens Clássica e Bayesiana / Mapping QTLs: Classical and Bayesian approaches

Elisabeth Regina de Toledo 02 October 2006 (has links)
A produção de grãos e outros caracteres de importância econômica para a cultura do milho, tais como a altura da planta, o comprimento e o diâmetro da espiga, apresentam herança poligênica, o que dificulta a obtenção de informações sobre as bases genéticas envolvidas na variação desses caracteres. Associações entre marcadores e QTLs foram analisadas através dos métodos de mapeamento por intervalo composto (CIM) e mapeamento por intervalo Bayesiano (BIM). A partir de um conjunto de dados de produção de grãos, referentes à avaliação de 256 progênies de milho genotipadas para 139 marcadores moleculares codominantes, verificou-se que as metodologias apresentadas permitiram classificar marcas associadas a QTLs. Através do procedimento CIM, associações entre marcadores e QTLs foram consideradas significativas quando o valor da estatística de razão de verossimilhança (LR) ao longo do cromossomo atingiu o valor máximo dentre os que ultrapassaram o limite crítico LR = 11,5 no intervalo considerado. Dez QTLs foram mapeados, distribuídos em três cromossomos. Juntos, explicaram 19,86% da variância genética. Os tipos de interação alélica predominantes foram de dominância parcial (quatro QTLs) e dominância completa (três QTLs). O grau médio de dominância calculado foi de 1,12, indicando dominância completa. Grande parte dos alelos favoráveis ao caráter foi proveniente da linhagem parental L0202D, que apresentou a mais elevada produção de grãos. Adotando-se a abordagem Bayesiana, foram implementados métodos de amostragem por cadeias de Markov (MCMC) para obtenção de uma amostra da distribuição a posteriori dos parâmetros de interesse, incorporando as crenças e incertezas a priori. Resumos sobre as localizações dos QTLs e seus efeitos aditivo e de dominância foram obtidos. Métodos MCMC com saltos reversíveis (RJMCMC) foram utilizados para a análise Bayesiana, e o Fator de Bayes foi calculado para estimar o número de QTLs.
Através do método BIM, associações entre marcadores e QTLs foram consideradas significativas em quatro cromossomos, com um total de cinco QTLs mapeados. Juntos, esses QTLs explicaram 13,06% da variância genética. A maior parte dos alelos favoráveis ao caráter também foi proveniente da linhagem parental L02-02D. / Grain yield and other traits of economic importance in maize, such as plant height, stalk length, and stalk diameter, exhibit polygenic inheritance, which makes it difficult to obtain information about the genetic bases underlying the variation of these traits. The number and locations of the loci (QTLs) that control grain yield in maize were estimated. Associations between markers and QTLs were analysed by composite interval mapping (CIM) and Bayesian interval mapping (BIM). Based on a set of grain-yield data from the evaluation of 256 maize progenies genotyped for 139 codominant molecular markers, the presented methodologies allowed markers associated with QTLs to be identified. Under CIM, associations between markers and QTLs were considered significant when the likelihood-ratio (LR) statistic along the chromosome surpassed the critical value LR = 11.5. Significant associations were obtained on three chromosomes, and ten QTLs were mapped, which together explained 19.86% of the genetic variance. The predominant allelic interactions for the mapped QTLs were partial dominance (four QTLs) and complete dominance (three QTLs). The average degree of dominance was 1.12, consistent with complete dominance for grain yield. Most of the favourable alleles came from the parental line L0202D, which had the highest grain yield. Under the Bayesian approach, Markov chain Monte Carlo (MCMC) sampling was implemented to obtain a sample from the posterior distribution of the parameters of interest, fully incorporating prior beliefs and parameter uncertainty; summaries of the QTL locations and their additive and dominance effects were obtained. Reversible-jump MCMC (RJMCMC) was used for the Bayesian analysis, and the Bayes factor was used to estimate the number of QTLs. Under BIM, significant associations were obtained on four chromosomes, and five QTLs were mapped, which together explained 13.06% of the genetic variance. Again, most of the favourable alleles came from the parental line L02-02D.
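The CIM decision rule summarized above (declare a QTL where the LR profile reaches a local maximum above the critical value 11.5) can be sketched as follows; the LR profile is invented for illustration, not taken from the thesis.

```python
# Hedged sketch of the LR-threshold rule for composite interval mapping:
# scan an LR profile along a chromosome and keep local maxima that exceed
# the critical value LR = 11.5. The profile values are invented.

THRESHOLD = 11.5

def qtl_peaks(lr, threshold=THRESHOLD):
    """Indices of local maxima of the LR profile above the threshold."""
    peaks = []
    for i in range(1, len(lr) - 1):
        if lr[i] > threshold and lr[i] >= lr[i - 1] and lr[i] >= lr[i + 1]:
            peaks.append(i)
    return peaks

profile = [2.0, 5.5, 13.2, 9.1, 4.0, 12.0, 16.8, 15.0, 3.3]  # toy LR values
print(qtl_peaks(profile))  # positions of putative QTLs: [2, 6]
```

Note that position 5 exceeds 11.5 but is not a local maximum of its interval, so only the two peaks are declared, mirroring the "maximum among values above the critical limit" rule in the abstract.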
235

Statistical Learning of Proteomics Data and Global Testing for Data with Correlations

Donglai Chen (6405944) 15 May 2019 (has links)
This dissertation consists of two parts. The first part is a collaborative project with Dr. Szymanski's group in Agronomy at Purdue to predict protein complex assemblies and interactions. Proteins in the leaf cytosol of Arabidopsis were fractionated using Size Exclusion Chromatography (SEC) and mixed-bed Ion Exchange Chromatography (IEX). Protein mass spectrometry data were obtained for the two separation platforms, with two replicates of each. We combine the four data sets and conduct a series of statistical learning steps: 1) data filtering, 2) a two-round hierarchical clustering to integrate multiple data types, 3) validation of the clustering against known protein complexes, and 4) mining the dendrogram trees to predict protein complexes. Our method is designed for integrative analysis of different data types and eliminates the difficulty of choosing an appropriate number of clusters in a clustering analysis. It provides a statistical learning tool to globally analyze the oligomerization state of a system of protein complexes.

The second part examines global hypothesis testing under sparse alternatives and arbitrarily strong dependence. Global tests are used to aggregate information and reduce the burden of multiple testing. A common situation in modern data analysis is that the variables with nonzero effects are sparse; this is the usual setting in genome-wide association study (GWAS) data, where the minimum p-value and higher criticism tests are particularly effective and more powerful than the F test. However, arbitrarily strong dependence among variables poses a great challenge to the p-value calculation for these optimal tests. We develop a latent-variable-adjusted method to correct the minimum p-value test: after adjustment, the test statistics become weakly dependent and the corresponding null distributions are valid. We show that if the latent variable is not related to the response variable, power can be improved. Simulation studies show that our method is more powerful than competing methods in the setting of highly sparse signals and correlated marginal tests. We also demonstrate its application on a real dataset.
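A minimal sketch of the minimum p-value global test discussed above: once marginal tests are (approximately) independent, as after the latent-variable adjustment, the minimum of m uniform p-values can be calibrated under the global null as 1 - (1 - p_min)^m. The p-values are invented, and the adjustment step itself is omitted.

```python
# Hedged sketch of the minimum p-value global test under (approximate)
# independence. With m independent marginal p-values that are Uniform(0,1)
# under the global null, P(min p <= t) = 1 - (1 - t)^m, giving the global
# p-value below. The marginal p-values here are invented.

def min_p_global(pvalues):
    """Global p-value of the min-p test assuming independent marginals."""
    m = len(pvalues)
    p_min = min(pvalues)
    return 1.0 - (1.0 - p_min) ** m

pvals = [0.40, 0.0004, 0.75, 0.22, 0.91]  # one sparse strong signal
print(round(min_p_global(pvals), 4))
```

This illustrates why the test suits sparse alternatives: one very small marginal p-value drives the global p-value down, whereas an F-type statistic would dilute that single signal across all m variables.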
236

Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data

Kusiak, Caroline 25 October 2018 (has links)
Dengue fever affects over 390 million people annually worldwide and is of particular concern in Southeast Asia, where it is one of the leading causes of hospitalization. Modeling trends in dengue occurrence can provide valuable information to public health officials; however, many challenges arise depending on the data available. In Thailand, reporting of dengue cases is often delayed by more than 6 weeks, and a small fraction of cases may not be reported until over 11 months after they occurred. This study shows that incorporating data on Google search trends can improve disease predictions in settings with severely underreported data. We compare penalized regression approaches to seasonal baseline models and illustrate that incorporation of search data can improve prediction error. This builds on previous research showing that search data and recent surveillance data together can be used to create accurate forecasts for diseases such as influenza and dengue fever. This work shows that even in settings where timely surveillance data are not available, using search data in real time can produce more accurate short-term forecasts than a seasonal baseline prediction. However, forecast accuracy degrades the further into the future the forecasts go, and the relative accuracy of these forecasts compared to a seasonal average forecast varies by location. Overall, these data and models can improve short-term public health situational awareness and should be incorporated into larger real-time forecasting efforts.
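The baseline-versus-search-data comparison described above can be gestured at with a toy example. The thesis uses penalized regression on Google search trends; this hedged sketch uses only a univariate least-squares fit and invented counts, so it illustrates the comparison logic, not the actual models.

```python
# Hedged sketch: compare a seasonal-average baseline forecast against a
# simple regression on an internet-search covariate, scoring both by mean
# absolute error (MAE). All counts and search indices are invented.

def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

# toy monthly dengue case counts and a search-volume index for training
train_cases = [10, 12, 30, 55, 40, 18]
search_idx  = [11, 13, 28, 50, 44, 20]

# univariate least squares: cases ~ a + b * search
n = len(train_cases)
mx = sum(search_idx) / n
my = sum(train_cases) / n
b = (sum((x - mx) * (y - my) for x, y in zip(search_idx, train_cases))
     / sum((x - mx) ** 2 for x in search_idx))
a = my - b * mx

# held-out months: the search signal is timely even when case reports lag
test_search = [12, 27, 52]
test_cases  = [11, 29, 54]
reg_pred  = [a + b * x for x in test_search]
base_pred = [my] * len(test_cases)   # seasonal-average baseline

print(mae(reg_pred, test_cases), mae(base_pred, test_cases))
```

In this toy data the search-informed forecast has a much smaller MAE than the flat baseline, which is the qualitative conclusion the abstract reports for short-term forecasts.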
237

Méthodes de simulation stochastique pour le traitement de l’information / Stochastic simulation methods for information processing

Minvielle-Larrousse, Pierre 05 March 2019 (has links)
Lorsqu’une grandeur d’intérêt ne peut être directement mesurée, il est fréquent de procéder à l’observation d’autres quantités qui lui sont liées par des lois physiques. Ces quantités peuvent contenir de l’information sur la grandeur d’intérêt si l’on sait résoudre le problème inverse, souvent mal posé, et inférer la valeur. L’inférence bayésienne constitue un outil statistique puissant pour l’inversion, qui requiert le calcul d’intégrales en grande dimension. Les méthodes Monte Carlo séquentielles (SMC), aussi dénommées méthodes particulaires, sont une classe de méthodes Monte Carlo permettant d’échantillonner selon une séquence de densités de probabilité de dimension croissante. Il existe de nombreuses applications, que ce soit en filtrage, en optimisation globale ou en simulation d’évènement rare. Les travaux ont porté notamment sur l’extension des méthodes SMC dans un contexte dynamique où le système, régi par un processus de Markov caché, est aussi déterminé par des paramètres statiques que l’on cherche à estimer. En estimation bayésienne séquentielle, la détermination de paramètres fixes provoque des difficultés particulières : un tel processus est non-ergodique, le système n’oubliant pas ses conditions initiales. Il est montré comment il est possible de surmonter ces difficultés dans une application de poursuite et identification de formes géométriques par caméra numérique CCD. Des étapes d’échantillonnage MCMC (Chaîne de Markov Monte Carlo) sont introduites pour diversifier les échantillons sans altérer la distribution a posteriori. Pour une autre application de contrôle de matériau, qui cette fois « hors ligne » mêle paramètres statiques et dynamiques, on a proposé une approche originale. 
Elle consiste en un algorithme PMMH (Particle Marginal Metropolis-Hastings) intégrant des traitements SMC Rao-Blackwellisés, basés sur des filtres de Kalman d'ensemble en interaction. D'autres travaux en traitement de l'information ont été menés, que ce soit en filtrage particulaire pour la poursuite d'un véhicule en phase de rentrée atmosphérique, en imagerie radar 3D par régularisation parcimonieuse ou en recalage d'image par information mutuelle. / When a quantity of interest cannot be directly measured, it is common to observe other quantities that are linked to it by physical laws. These can provide information about the quantity of interest if one can solve the inverse problem, which is often ill posed, and infer the value. Bayesian inference is a powerful statistical tool for inversion, but it requires the computation of high-dimensional integrals. Sequential Monte Carlo (SMC) methods, also known as interacting particle methods, are a class of Monte Carlo methods able to sample from a sequence of probability densities of growing dimension. They have many applications, for instance in filtering, global optimization, and rare-event simulation. The work focused in particular on extending SMC methods to a dynamic context where the system, governed by a hidden Markov process, is also determined by static parameters that we seek to estimate. In sequential Bayesian estimation, the determination of fixed parameters causes particular difficulties: such a process is non-ergodic, as the system does not forget its initial conditions. It is shown how these difficulties can be overcome in an application of tracking and identifying geometric shapes with a CCD digital camera; Markov chain Monte Carlo (MCMC) sampling steps are introduced to diversify the samples without altering the posterior distribution. For another application, material control, which this time mixes static and dynamic parameters offline, we proposed an original approach: a Particle Marginal Metropolis-Hastings (PMMH) algorithm integrating Rao-Blackwellized SMC steps based on a bank of interacting ensemble Kalman filters. Other information-processing work was also conducted: particle filtering for tracking a vehicle during atmospheric re-entry, 3D radar imaging by sparse regularization, and image registration by mutual information.
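The SMC idea summarized above can be illustrated with a compact bootstrap particle filter. This is a hedged sketch on an invented model (a 1-D random walk observed with Gaussian noise), not the PMMH or Rao-Blackwellized algorithms of the thesis.

```python
# Hedged sketch of a bootstrap particle filter: propagate particles through
# the state dynamics, weight them by the observation likelihood, estimate
# the filtered mean, and resample. Model and parameters are invented.

import math
import random

random.seed(42)

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def particle_filter(obs, n=2000, q=0.5, r=0.5):
    """Filtered means for a 1-D random walk (process std q, obs std r)."""
    parts = [random.gauss(0.0, 1.0) for _ in range(n)]
    means = []
    for y in obs:
        # propagate through the random-walk dynamics
        parts = [p + random.gauss(0.0, q) for p in parts]
        # weight by the observation likelihood, then normalize
        w = [gauss_pdf(y, p, r) for p in parts]
        s = sum(w)
        w = [wi / s for wi in w]
        means.append(sum(wi * p for wi, p in zip(w, parts)))
        # multinomial resampling to avoid weight degeneracy
        parts = random.choices(parts, weights=w, k=n)
    return means

obs = [0.1, 0.4, 1.0, 1.3, 1.2]   # invented observations
est = particle_filter(obs)
print([round(m, 2) for m in est])
```

The resampling step is the simplest choice; the thesis's concern with static parameters arises precisely because resampling alone cannot rejuvenate fixed-parameter particles, which motivates the MCMC diversification steps mentioned in the abstract.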
238

Kunskap, motivation och stöd : En studie om kompetensutveckling inom statistisk analys av experimentella stöd / Knowledge, Motivation and Support : A study about competence development in statistical analysis of experimental data

Pettersson, Jennifer January 2014 (has links)
Testing and analysis of measurement data are a major part of the workday for development engineers within the section Fluid and Emission Treatment, to assure the quality of the products developed by Scania CV. The ineluctable variation, and therefore uncertainty, of results can be quantified and discussed with the use of statistical methods. The overall objective of this study is to examine how the competence of development engineers can be improved within the area of statistics. This has been done by relating the concept of competence to the use of statistical methods for analysing experimental data and by investigating which factors influence the competence of the engineers. Observations of the daily operations, interviews and a focus group with development engineers, and a workshop with management personnel have been used as research methods in this study. The results show that the desired competence, from the perspective of the individual, consists of knowledge, motivation and support. It is important that none of these parts is neglected when the competence of engineers is to be strengthened. The demand for statistical analysis by management and decision makers is a factor that has been shown to have great influence on the competence of development engineers. The competence of development engineers is also influenced by which tools and other types of support the organisation offers. The conclusion is that competence development is more comprehensive than a one-time effort, such as a short course, since competence needs to be maintained and developed as the business evolves. Continuous support from the organisation would probably contribute to a more active development of competence. Knowledge sharing between colleagues on internal digital platforms has great potential and could be a component of continuous support for the development engineers.
/ Provtagningar och analys av mätdata är en stor del av utvecklingsingenjörernas arbetsvardag inom sektionen Fluid and Emission Treatment för att säkerställa kvalitén på de produkter som Scania CV AB utvecklar. Den ofrånkomliga variationen och därmed osäkerheten i resultat kan kvantifieras och diskuteras med statistiska metoder. Syftet med denna studie är att undersöka hur utvecklingsingenjörernas kompetens inom statistikämnet kan förstärkas. Detta har gjorts genom att relatera kompetensbegreppet till användningen av statistiska metoder för att analysera experimentella data samt utreda vilka faktorer som påverkar ingenjörens kompetens. Observationer av verksamheten, intervjuer och en fokusgrupp med utvecklingsingenjörer samt en workshop med personer i ledande befattning har använts som forskningsmetod i studien. Resultatet visar att den önskade kompetensen, ur individens perspektiv, består av tre delar, kunskap, motivation och stöd. Det är viktigt att ingen av dessa delar bortses ifrån när ingenjörernas kompetens ska utvecklas. Efterfrågan av statistiska analyser från chefer och beslutsfattare är en faktor som har visat sig ha stor påverkan på utvecklingsingenjörernas kompetens. Vilka hjälpmedel och övriga stöd som erbjuds från organisationen har även stor inverkan på kompetensen. Slutsatsen är att kompetensutvecklingen är mer omfattande än en punktinsats, som exempelvis en kort kurs, eftersom kompetensen måste underhållas samt utvecklas vartefter verksamheten utvecklas. Ett kontinuerligt stöd från organisationen bidrar sannolikt till en mer aktiv kompetensutveckling. Kunskapsdelning mellan kollegor via interna digitala plattformar har stor potential och skulle kunna utgöra en del av ett kontinuerligt stöd för utvecklingsingenjörerna.
239

Dynamic Model Pooling Methodology for Improving Aberration Detection Algorithms

Sellati, Brenton J 01 January 2010 (has links) (PDF)
Syndromic surveillance is defined generally as the collection and statistical analysis of data which are believed to be leading indicators for the presence of deleterious activities developing within a system. Conceptually, syndromic surveillance can be applied to any discipline in which it is important to know when external influences manifest themselves in a system by forcing it to depart from its baseline. Comparisons of syndromic surveillance systems have led to mixed results, where models that dominate in one performance metric are often sorely deficient in another. This results in a zero-sum trade-off in which one performance metric must be afforded greater importance for a decision to be made. This thesis presents a dynamic pooling technique which allows for the combination of competing syndromic surveillance models in such a way that the resulting detection algorithm offers a better combination of sensitivity and specificity, two of the key model metrics, than any of the models individually. We then apply this methodology to a simulated data set in the context of detecting outbreaks of disease in an animal population. We find that this dynamic pooling methodology is robust in the sense that it is capable of superior overall performance with respect to sensitivity, specificity, and mean time to detection under varying conditions of baseline data behavior, e.g. controlling for the presence or absence of various levels of trend and seasonality, as well as in simulated out-of-sample performance tests.
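The pooling idea described above can be gestured at with a toy example: each detector emits a standardized aberration score, and the pooled detector alarms on a weighted combination, so the pool can trade the sensitivity of one model against the specificity of the other. This is a hedged sketch with invented scores and weights, not the thesis's dynamic pooling algorithm.

```python
# Hedged sketch of pooling two aberration-detection algorithms via a convex
# combination of their standardized scores, with a single alarm threshold.
# Scores, the weight w, and the threshold are all invented.

def pooled_alarms(scores_a, scores_b, w=0.6, threshold=2.0):
    """Alarm indicator per time step from a convex pool of two score series."""
    pooled = [w * sa + (1 - w) * sb for sa, sb in zip(scores_a, scores_b)]
    return [1 if s > threshold else 0 for s in pooled]

# toy standardized scores: detector A is jumpy, detector B is conservative
a = [0.2, 2.6, 0.4, 3.1, 0.1]
b = [0.1, 1.9, 0.3, 2.8, 0.2]

print(pooled_alarms(a, b))  # alarms only where both detectors agree: [0, 1, 0, 1, 0]
```

A dynamic version, in the spirit of the abstract, would update the weight w over time based on each model's recent performance rather than fixing it in advance.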
240

The development of authentic virtual reality scenarios to measure individuals’ level of systems thinking skills and learning abilities

Dayarathna, Vidanelage L. 10 December 2021 (has links) (PDF)
This dissertation develops virtual reality modules to capture individuals' learning abilities and systems thinking skills in dynamic environments. In the first chapter, an immersive queuing-theory teaching module is developed using virtual reality technology. The objective of the study is to present systems engineering concepts in a more sophisticated environment and measure students' learning abilities. Furthermore, the study explores the performance gaps between male and female students in manufacturing systems concepts. To investigate gender biases in the performance of the developed VR module, three efficacy measures (the simulation sickness questionnaire, the System Usability Scale, and the presence questionnaire) and two effectiveness measures (the NASA TLX assessment and a post-motivation questionnaire) were used. The second and third chapters aim to assess individuals' systems thinking skills when they engage in complex multidimensional problems. A modern complex system comprises many interrelated subsystems and various dynamic attributes, and understanding and handling large complex problems requires holistic critical thinkers in modern workplaces. Systems Thinking (ST) is an interdisciplinary domain that offers different ways to better understand the behavior and structure of a complex system. The developed scenario-based instrument measures students' cognitive tendency for complexity, change, and interaction when making decisions in a turbulent environment. The proposed complex-systems scenarios are developed based on an established systems thinking instrument that measures important aspects of systems thinking skills. The scenarios are built in a virtual environment that allows students to react to real-world situations and make decisions. The construct validity of the VR scenarios is assessed by comparing scores on the established ST instrument with those from the developed VR scenarios.
Furthermore, the efficacy of the VR scenarios is investigated using the simulation sickness questionnaire, systems usability scale, presence questionnaire, and NASA TLX assessment.
