Global ETD Search

1	Statistical quality assurance of IGUM: Statistical quality assurance and validation of IGUM in a steady and dynamic gas flow prior to proof of concept Kornsäter, Elin, Kallenberg, Dagmar January 2022 To further support and optimise the production of diving tables for the Swedish Armed Forces, a research team has developed a new machine called IGUM (Inert Gas UndersökningsMaskin), which aims to measure how inert gas is taken up and exhaled. Because the design of the machine is new, the goal of this thesis was to statistically validate its accuracy and verify its reliability. In the first stage, a quality assurance of IGUM's linear position conversion key was conducted in a steady, known gas flow. Data were collected in 29 experiments and examined with ordinary least squares, hypothesis testing, analysis of variance, bootstrapping and Bayesian hierarchical modelling. Autocorrelation was detected among the residuals, but the bootstrap analysis indicated that it did not affect the results. The estimated conversion key was 1.276 ml per linear position, which was statistically significant in all 29 experiments. In the second stage, the thesis examined whether and how well IGUM could detect small additions of gas in a dynamic flow. The breathing machine ANSTI was used to simulate the sinusoidal pattern of human breathing in 24 experiments, in each of which three additions of 30 ml of gas were manually introduced into the system. The results were analysed through sinusoidal regression, where three dummy variables represented the three additions of gas in each experiment. To examine whether IGUM detects 30 ml for each addition, the previously validated conversion key of 1.276 ml per linear position was used. An attempt was made to remove the seasonal trend in the data; this was not completely successful, which could influence the estimates.
The results showed that IGUM can indeed detect these small gas additions, although the amount detected differed somewhat between dummies and experiments. This is most likely because not enough of the trend was removed, rather than because IGUM is not working properly.
Keywords: Statistic; Bayesian hierarchical modelling; sinusoidal regression; ordinary least squares; hypothesis testing; analysis of variance; Probability Theory and Statistics
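The second-stage analysis in the abstract above can be illustrated with a minimal sketch. It assumes the breathing period is known, so that a model with an intercept, one sine/cosine pair and three step dummies is linear in its coefficients and can be fitted by ordinary least squares. The data, the period, the timing of the additions and the helper names `ols` and `design_row` are all made up for illustration; only the conversion key of 1.276 ml per linear position comes from the abstract, and this is not the authors' code.

```python
import math
import random

ML_PER_POSITION = 1.276  # conversion key reported in the abstract (ml per linear position)


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def design_row(t, period, step_times):
    """Intercept, one sine/cosine pair, and one step dummy per gas addition."""
    w = 2.0 * math.pi / period
    dummies = [1.0 if t >= s else 0.0 for s in step_times]
    return [1.0, math.sin(w * t), math.cos(w * t)] + dummies


# Synthetic experiment (every value below is hypothetical).
rng = random.Random(0)
period, step_times = 4.0, [15.0, 30.0, 45.0]
true_step = 30.0 / ML_PER_POSITION  # 30 ml expressed in linear positions
times = [0.1 * k for k in range(600)]
signal = [100.0 + 400.0 * math.sin(2.0 * math.pi * t / period)
          + true_step * sum(t >= s for s in step_times)
          + rng.gauss(0.0, 0.5)
          for t in times]

X = [design_row(t, period, step_times) for t in times]
beta = ols(X, signal)
est_ml = [b * ML_PER_POSITION for b in beta[3:]]  # dummy coefficients converted to ml
print([round(v, 1) for v in est_ml])
```

Each dummy coefficient is the estimated level shift, in linear positions, caused by one addition; multiplying by the conversion key expresses it in ml so it can be compared with the nominal 30 ml. Residual trend that the sinusoid does not absorb would bias these coefficients, which is the concern the abstract raises.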
2	Abundância e distribuição da baleia jubarte (Megaptera novaeangliae) na costa do Brasil [Abundance and distribution of the humpback whale (Megaptera novaeangliae) off the coast of Brazil] Julião, Heloise Pavanato January 2013 Dissertation (master's) - Universidade Federal do Rio Grande, Programa de Pós-Graduação em Oceanografia Biológica, Instituto de Oceanografia, 2013. / Made available in DSpace on 2013-10-17. Previous issue date: 2013 / The population is the fundamental unit of conservation, and the simplest way to monitor it is regular sampling over time to assess population status. One of the Southern Hemisphere humpback whale populations uses the Brazilian coast, typically from May to December, for breeding and calving. This population, labelled “breeding stock A” by the International Whaling Commission, has shown signs of recovery following a marked decline caused by whaling and a long moratorium period. It is concentrated mainly on the Abrolhos Bank (BA), where calm, warm waters appear to provide ideal habitat. The goal of this study was to estimate the size of the humpback whale population for 2011 and to predict the distribution of groups along the Brazilian coast. Distance sampling methods were implemented, and hierarchical Bayesian models were proposed to estimate abundance. Conditional auto-regressive models were used to predict density in a lattice of cells of 0.5° of latitude and longitude.
Population size was estimated at 10,160 whales (95% credible interval: 6,607-17,692). The highest densities were predicted between the Abrolhos Bank and Todos os Santos Bay (BA). The results suggest that population growth is leading to an expansion of the population beyond the Abrolhos Bank.
Keywords: Megaptera novaeangliae; Breeding stock A; Population size; Density; Occurrence; Predictive models; Bayesian hierarchical modelling; Oceanographic variables
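For readers unfamiliar with distance sampling, the following is a minimal sketch of the classical line-transect estimator with a half-normal detection function. It is not the hierarchical Bayesian model used in the dissertation, and the perpendicular distances, transect length, mean group size and survey area are all hypothetical values chosen for illustration.

```python
import math


def half_normal_density(distances, line_length):
    """Line-transect density estimate under a half-normal detection function.

    Assumes g(x) = exp(-x^2 / (2 sigma^2)), untruncated perpendicular
    distances, and certain detection on the transect line.
    Returns (sigma, effective strip half-width, groups per unit area).
    """
    n = len(distances)
    sigma = math.sqrt(sum(x * x for x in distances) / n)  # maximum-likelihood estimate
    esw = sigma * math.sqrt(math.pi / 2.0)                 # integral of g(x) over x >= 0
    density = n / (2.0 * line_length * esw)
    return sigma, esw, density


# Hypothetical survey: perpendicular distances (km) to detected groups.
distances_km = [0.2, 0.5, 0.9, 1.1, 0.3, 1.6, 0.7, 0.4]
sigma, esw, group_density = half_normal_density(distances_km, line_length=250.0)

mean_group_size = 1.8  # assumed
area_km2 = 50000.0     # assumed
abundance = group_density * mean_group_size * area_km2
print(f"sigma={sigma:.3f} km, ESW={esw:.3f} km, N={abundance:.0f}")
```

The effective strip half-width converts a count of detections into a density by accounting for groups missed at larger distances. A hierarchical Bayesian treatment, as described in the abstract, would place priors on these quantities and report a credible interval rather than a point estimate.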
3	Dynamic Bayesian models for modelling environmental space-time fields Dou, Yiping 05 1900 This thesis addresses spatial interpolation and temporal prediction of air pollution data using several space-time modelling approaches. Firstly, we implement the dynamic linear modelling (DLM) approach in spatial interpolation and find various potential problems with that approach. We develop software to implement our approach. Secondly, we implement a Bayesian spatial prediction (BSP) approach to model spatio-temporal ground-level ozone fields and compare the accuracy of that approach with that of the DLM. Thirdly, we develop a Bayesian version of the empirical orthogonal function (EOF) method to incorporate the uncertainties arising from a temporally varying spatial process and from spatial variation at broad and fine scales. Finally, we extend the BSP into the DLM framework to develop a unified Bayesian spatio-temporal model for univariate and multivariate responses. The result generalizes a number of current approaches in this field.
Keywords: Bayesian hierarchical modelling; Kalman filter; Dynamic linear modelling; Bayesian spatial prediction; MCMC algorithm; Gibbs sampling; Wishart distributions; Forward filtering; Backward sampling; Bayesian empirical orthogonal functions; Bayesian spatial prediction methods
5	Dynamic Bayesian models for modelling environmental space-time fields Dou, Yiping 05 1900 This thesis addresses spatial interpolation and temporal prediction of air pollution data using several space-time modelling approaches. Firstly, we implement the dynamic linear modelling (DLM) approach in spatial interpolation and find various potential problems with that approach. We develop software to implement our approach. Secondly, we implement a Bayesian spatial prediction (BSP) approach to model spatio-temporal ground-level ozone fields and compare the accuracy of that approach with that of the DLM. Thirdly, we develop a Bayesian version of the empirical orthogonal function (EOF) method to incorporate the uncertainties arising from a temporally varying spatial process and from spatial variation at broad and fine scales. Finally, we extend the BSP into the DLM framework to develop a unified Bayesian spatio-temporal model for univariate and multivariate responses. The result generalizes a number of current approaches in this field. / Science, Faculty of / Statistics, Department of / Graduate
Keywords: Bayesian hierarchical modelling; Kalman filter; Dynamic linear modelling; Bayesian spatial prediction; MCMC algorithm; Gibbs sampling; Wishart distributions; Forward filtering; Backward sampling; Bayesian empirical orthogonal functions; Bayesian spatial prediction methods
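The dynamic linear modelling and Kalman filter keywords in the record above can be made concrete with a minimal sketch of forward filtering for the simplest DLM, the univariate local-level model. This is an illustration only: the thesis works with spatial, multivariate models, whereas the function name `kalman_local_level`, the variances and the readings below are hypothetical.

```python
def kalman_local_level(ys, obs_var, state_var, m0=0.0, c0=1e6):
    """Forward filter for y_t = theta_t + v_t, theta_t = theta_{t-1} + w_t.

    v_t ~ N(0, obs_var), w_t ~ N(0, state_var), theta_0 ~ N(m0, c0).
    Returns the filtered means and variances of theta_t given y_1..y_t.
    """
    means, variances = [], []
    m, c = m0, c0
    for y in ys:
        r = c + state_var       # prior variance of theta_t
        q = r + obs_var         # one-step forecast variance of y_t
        gain = r / q            # Kalman gain
        m = m + gain * (y - m)  # filtered mean
        c = r * (1.0 - gain)    # filtered variance
        means.append(m)
        variances.append(c)
    return means, variances


# Made-up ozone-like readings at a single monitoring site.
readings = [31.0, 33.5, 32.1, 35.8, 36.4, 34.9, 38.2, 37.5]
means, variances = kalman_local_level(readings, obs_var=4.0, state_var=1.0)
for y, m, c in zip(readings, means, variances):
    print(f"y={y:5.1f}  filtered mean={m:6.2f}  variance={c:5.2f}")
```

In a forward filtering, backward sampling scheme, these filtered moments would be stored and then used in a backward pass to draw the whole state sequence within a Gibbs sampler. That step is omitted here.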
6	Machine Learning methods in shotgun proteomics Truong, Patrick January 2023 As high-throughput biology experiments generate increasing amounts of data, the field is naturally turning to data-driven methods for the analysis and extraction of novel insights. These insights into biological systems are crucial for understanding disease progression, drug targets, treatment development, and diagnostic methods, ultimately leading to improved human health and well-being, as well as deeper insight into cellular biology. Biological data sources such as the genome, transcriptome, proteome, metabolome, and metagenome provide critical information about biological system structure, function, and dynamics. The focus of this licentiate thesis is on proteomics, the study of proteins, which is a natural starting point for understanding biological functions, as proteins are key functional components of cells. Proteins play a crucial role in enzymatic reactions, structural support, transport, storage, cell signaling, and immune system function. In addition, proteomics has vast data repositories, and technical and methodological improvements are continually being made to yield even more data. However, generating proteomic data involves multiple steps, which are prone to errors, making sophisticated models essential to handle technical and biological artifacts and account for uncertainty in the data. In this licentiate thesis, the use of machine learning and probabilistic methods to extract information from mass-spectrometry-based proteomic data is investigated. The thesis starts with an introduction to proteomics, including a basic biological background, followed by a description of how mass-spectrometry-based proteomics experiments are performed and of the challenges in proteomic data analysis. The statistics of proteomic data analysis are also explored, and state-of-the-art software and tools related to each step of the proteomics data analysis pipeline are presented.
The thesis concludes with a discussion of future work and the presentation of two original research works. The first research work focuses on adapting Triqler, a probabilistic graphical model for protein quantification developed for data-dependent acquisition (DDA) data, to data-independent acquisition (DIA) data. Challenges in this study included verifying that DIA data conformed with the model used in Triqler, addressing benchmarking issues, and modifying the missing value model used by Triqler to suit DIA data. The study showed that DIA data conformed with the properties required by Triqler, implemented a protein inference harmonization strategy, and modified the missing value model for DIA data. The study concluded by showing that Triqler outperformed current state-of-the-art protein quantification techniques. The second research work focused on developing a novel deep-learning-based MS2-intensity predictor by incorporating the transformer self-attention mechanism into Prosit, an established deep learning framework for MS2 spectrum intensity prediction based on recurrent neural networks (RNNs). RNNs are a type of neural network that can efficiently process sequential data by maintaining hidden states that capture information from previous steps in a sequential manner. The transformer self-attention mechanism allows a model to attend to different parts of its input sequence simultaneously and independently during processing, enabling it to capture dependencies and relationships between elements more effectively. Transformers therefore remedy some of the drawbacks of RNNs. We thus hypothesized that implementing the MS2-intensity predictor with transformers rather than an RNN would improve its performance. Hence, Prosit-transformer was developed, and the study showed that both the model training time and the similarity between the predicted and observed MS2 spectra improved.
These original research works address various challenges in computational proteomics and contribute to the development of data-driven life science. / QC 2023-05-22
Keywords: benchmark; mathematical methods; transformers; computational proteomics; proteomics; bioinformatics; bert; ms2 intensity; probabilistic modelling; Bioinformatics (Computational Biology)
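The self-attention mechanism that the abstract above contrasts with RNNs can be sketched in a few lines. This is a minimal illustration of scaled dot-product self-attention, assuming identity query, key and value projections and arbitrary toy embeddings. It is not the Prosit-transformer architecture, which the abstract does not describe in detail.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    weights = []
    for q in tokens:
        # Every position scores every other position directly, with no recurrence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights.append(softmax(scores))
    # Each output is a weighted average of all token vectors.
    outputs = [[sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
               for row in weights]
    return outputs, weights


# Toy embeddings for a three-residue peptide (values are arbitrary).
peptide = [[1.0, 0.0, 0.5], [0.2, 1.0, 0.1], [0.9, 0.1, 0.4]]
outputs, weights = self_attention(peptide)
for row in weights:
    print([round(w, 3) for w in row])
```

Because each row of weights is computed independently of the others, all positions can be processed in parallel. An RNN must instead pass a hidden state from step to step, which is one plausible reason for the shorter training time the abstract reports.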
