121 |
Efficient Parameter Inference for Stochastic Chemical Kinetics
Paul, Debdas January 2014
Parameter inference for stochastic systems is considered one of the fundamental classical problems in computational systems biology. The problem becomes challenging, and often analytically intractable, when the number of uncertain parameters is large. In this scenario, Markov Chain Monte Carlo (MCMC) algorithms have proven highly effective. For a stochastic system, the most accurate description of the kinetics is given by the Chemical Master Equation (CME). Unfortunately, an analytical solution of the CME is often intractable even for a fairly small number of chemically reacting species because of its super-exponential state space complexity. As a solution, the Stochastic Simulation Algorithm (SSA), which uses a Monte Carlo approach, was introduced to simulate the chemical process defined by the CME. SSA is an exact stochastic method for simulating the CME, but it suffers from high time complexity because every reaction is simulated. Computing the likelihood function (based on the exact CME) in MCMC therefore becomes expensive, which in turn makes the rejection step expensive. In this generic work, we introduce different approximations of the CME as a pre-conditioning step to the full MCMC to make rejection cheaper. The goal is to avoid expensive computation of the exact CME as far as possible. We show that, with an effective pre-conditioning scheme, one can save a considerable number of exact CME computations while maintaining similar convergence characteristics. Additionally, we investigate three different sampling schemes (dense sampling, longer sampling and i.i.d. sampling) under which convergence of MCMC using the exact CME for parameter estimation can be analyzed. We find that the i.i.d. sampling scheme achieves better convergence than dense sampling of the same process or sampling the same process for a longer time. We verify our theoretical findings for two different processes: linear birth-death and dimerization. Apart from providing a framework for parameter inference using the CME, this work also explains why the CME has (in general) been avoided as a parameter estimation technique for so many years after its formulation.
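To make the cost argument concrete, the following is a minimal Python sketch of the SSA for a birth-death process. It is an illustration only, not the thesis's implementation: the function name `simulate_birth_death`, the parameter names, and the choice of a constant birth rate with a death rate proportional to the copy number are all assumptions. Every reaction event costs one loop iteration, which is why exact simulation becomes expensive inside an MCMC loop.

```python
import random


def simulate_birth_death(birth_rate, death_rate, x0, t_end, rng):
    """Gillespie SSA for the (assumed) reactions
         0 -> X   with propensity birth_rate
         X -> 0   with propensity death_rate * x
    Returns the jump times and the copy number after each jump."""
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        a_birth = birth_rate
        a_death = death_rate * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states


if __name__ == "__main__":
    rng = random.Random(0)
    times, states = simulate_birth_death(10.0, 1.0, 0, 5.0, rng)
    print(len(times) - 1, "reaction events simulated; final copy number:", states[-1])
```

The number of loop iterations grows with the total propensity and the simulated time span, so a likelihood evaluation that relies on many such trajectories scales poorly; this is the cost that a cheaper approximate pre-conditioning step is meant to avoid.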
|
122 |
Phylogenetic analysis of secretion systems in Francisellaceae and Legionellales: Investigating events of intracellularization
Nyrén, Karl January 2021
Host-adapted bacteria are pathogens that, through evolutionary time and host-adaptive events, acquired the ability to manipulate hosts into assisting their own reproduction and spread. Through these host-adaptive events, free-living pathogens may be rendered unable to reproduce without their host, which is an irreversible step in evolution. Francisellaceae and Legionellales, two orders of Gammaproteobacteria, are cases where host-adaptation has led to an intracellular lifestyle. Both orders use secretion systems, in combination with effector proteins, to invade and control their hosts. A current view is that Francisellaceae and Legionellales went through host-adaptive events at two separate time points. However, F. hongkongensis, a member of Francisellaceae, shares the same secretion system as the order Legionellales. Additionally, two host-adapted Gammaproteobacteria, Piscirickettsia spp. and Berkiella spp., swap phylogenetic positions between Legionellales and Francisellaceae depending on the methods applied, indicating shared features of Francisellaceae and Legionellales. In this study, we set up a workflow to screen public metagenomic data for candidate host-adapted bacteria. Using these data, we attempted to ascertain the phylogenetic position and possibly resolve evolutionary events that occurred in Legionellales, F. hongkongensis, Francisellaceae, Piscirickettsia spp. and Berkiella spp. We successfully acquired 23 candidate host-adapted MAGs by (i) scanning for genes among reads before assembly using PhyloMagnet, and (ii) screening for complete secretion systems with MacSyFinder. The phylogenetic results were inconclusive regarding the placement of Berkiella spp. and Piscirickettsia. However, the results of this study indicate that, contrary to previous beliefs, a single intracellularization event in a common ancestor may have given rise to the intracellular lifestyle of Francisellaceae and Legionellales.
|
123 |
Clustering approaches for extracting structural determinants of enzyme active sites
Stamatelou, Ismini - Christina January 2020
The study of enzyme binding sites is essential but demanding, since the amino acids lining these areas are not rigid. At the same time, minimizing side effects and ensuring the specificity of new ligands is a great challenge in structure-based drug design. Using glycogen phosphorylase, a validated target for the development of new antidiabetic agents, as a case study, this project examines the side-chain conformations of amino acids that play a key role in the catalytic site of the enzyme. Specifically, different rotamers of each amino acid were collected to build a dataset of different conformations of the catalytic site. The rotamers were filtered by their probability of occurrence, and all rotamers that create steric clashes were subsequently rejected. These conformations were then clustered based on their similarity. Three different clustering algorithms and multiple numbers of clusters were tested, with silhouette scores used to evaluate the clustering. Similarity was measured with the Euclidean metric, which, because the coordinates of the conformations correspond to one another, was very similar to the cRMSD metric. Two-level clustering was applied to the dataset for more in-depth observations. According to the clustering results, specific amino acids with major geometrical variations in their rotamers play the most important role in the separation of the clusters. Additionally, all rotamers of an amino acid can be grouped based on their structure, which was confirmed using the “Chimera” software as a visualization tool. The ultimate aim of this study is to examine whether clustering the conformations produces clusters whose points are geometrically similar to each other, in order to identify near neighbors, i.e. conformations that are quite similar in structure but do not play a determinant role in the function, and those that are quite diverse and could be further exploited.
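The closeness of the Euclidean metric and cRMSD noted in the abstract can be illustrated with a small Python sketch. It assumes that both conformations list the same atoms in the same order and that no superposition step is applied; the function names and the toy coordinates are hypothetical and are not taken from the thesis. Under those assumptions the two quantities differ only by a constant factor of the square root of the number of atoms, so they rank pairs of conformations identically.

```python
import math


def euclidean_distance(conf_a, conf_b):
    """Euclidean distance between two conformations, each given as a list of
    (x, y, z) atom coordinates in the same atom order."""
    return math.sqrt(sum((p - q) ** 2
                         for atom_a, atom_b in zip(conf_a, conf_b)
                         for p, q in zip(atom_a, atom_b)))


def crmsd(conf_a, conf_b):
    """Coordinate RMSD without superposition (an assumption of this sketch):
    root of the mean squared per-atom displacement."""
    n_atoms = len(conf_a)
    squared = sum((p - q) ** 2
                  for atom_a, atom_b in zip(conf_a, conf_b)
                  for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared / n_atoms)


if __name__ == "__main__":
    # Toy coordinates: the second conformation is the first shifted by 1 along x.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    b = [(x + 1.0, y, z) for x, y, z in a]
    print(euclidean_distance(a, b), crmsd(a, b) * math.sqrt(len(a)))
```

Because the scaling factor is the same for every pair, any clustering that depends only on the ordering or relative size of distances behaves the same under either metric.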
|
124 |
Deep Learning Models for Profiling of Kinase Inhibitors
Eriksson, Linnea January 2020
With the advent of fluorescence microscopy and image analysis, quantitative information can be extracted from images and changes in cell morphology can be studied. Microscopy-based morphological profiling assays with multiplexed fluorescent dyes, like Cell Painting, can be used for this purpose. It has been shown that morphological profiles can be used to train AI models to classify images into different biological mechanisms. Hence, the goal of this project was to study whether Deep Learning models and Convolutional Neural Networks can distinguish between different classes of kinase inhibitors based on their morphological profiles. Three different Convolutional Neural Network architectures were used: ResNet50, MobileNetV2, and VGG16. They were trained with two different inputs and two different optimisers: Adam and SGD. The performance with and without Transfer Learning through ImageNet weights was also compared. The results indicate that MobileNetV2 with Adam as an optimiser performed the best, with a micro-average of 0.93 and higher ROC areas compared to the other models. The study also highlighted the importance of utilizing Transfer Learning.
|
125 |
Machine learning for automatic grading of knee osteoarthritis from X-ray radiographs
Siggstedt, Ellen January 2023
Knee osteoarthritis is a growing problem due to increasing risk factors such as age and obesity. It is a common task for a radiologist to grade osteoarthritis in three compartments of a knee (medial tibiofemoral (MTF), lateral tibiofemoral (LTF) and patellofemoral (PF)) from different X-ray image views, to decide whether osteoarthritis is the cause of the patient's pain. Reasons for automating this process are to decrease subjectivity, shorten the time to diagnosis and reduce the workload for radiologists. The aim of this project was to grade osteoarthritis using machine learning by training convolutional neural networks on around 5000 examinations double-annotated by radiologists and one orthopaedic surgeon at Nyköping Hospital. Different methods were evaluated and the models were then optimised with hyperparameter tuning. The project is intended to contribute to future software that could be tested at Nyköping Hospital. The project found that the best-performing transfer learning setup was DenseNet for MTF and PF, with an MTF model used as the transfer learning model for the LTF model. Cropping the images around the region of interest for MTF and LTF also improved the models. The best method for making predictions from the model outputs appeared to be training a model on a merged set of training and validation data. Comparisons of the final models with the radiologists' initial annotations showed that the MTF and LTF models give fewer misclassifications of more than one grade than the two radiologists have disagreements of more than one grade. For the PF model, however, the radiologists still have an advantage, and more data is probably needed for both the PF model and the LTF model, since grade 0 is heavily overrepresented in those compartments.
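The comparison described above, misclassifications of more than one grade versus inter-reader disagreements of more than one grade, can be expressed as one small function. The Python sketch below is an illustration under assumptions: the function name `large_disagreement_rate` and the example grades are invented, and integer-coded grades are assumed.

```python
def large_disagreement_rate(grades_a, grades_b, tolerance=1):
    """Fraction of examinations where two gradings differ by more than
    `tolerance` grades. Grades are assumed to be integer-coded."""
    if len(grades_a) != len(grades_b):
        raise ValueError("both gradings must cover the same examinations")
    if not grades_a:
        raise ValueError("at least one examination is required")
    n_large = sum(1 for a, b in zip(grades_a, grades_b) if abs(a - b) > tolerance)
    return n_large / len(grades_a)


if __name__ == "__main__":
    # Invented example grades, not data from the study.
    model = [0, 1, 3, 2]
    reader = [0, 2, 1, 2]
    print(large_disagreement_rate(model, reader))
```

Applying the same function once to model-versus-reader grades and once to reader-versus-reader grades gives two rates that can be compared directly.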
|
126 |
Text Mining Methods for Biomedical Data Analysis / Text Mining Metoder för Biomedicinsk Data Analys
Jabeen, Rakhshanda January 2021
Topic modeling of biological data has become very popular among researchers in recent times. However, analysing countless research papers and gathering consensus regarding biomedicine is a near-impossible task for any researcher due to the complexity and quantity of the material that is published. This thesis focuses on two objectives that can help researchers in this domain, based on data related to five major DNA repair pathways. The first objective is to propose an unsupervised approach to examine hidden structures and analyse research trends in temporal biomedical text data. The second objective is to find DNA repair markers involved in immune defense and retrieve potential PPIs, GIs, and disease-gene associations reported in the literature. We used Latent Dirichlet Allocation (LDA) to discover hidden themes and semantically coherent topics in text. We clustered the documents based on LDA topic models to analyse research trends and used the Mann-Kendall test to assess the trends of the topics. Text mining methods were combined with a classical co-occurrence statistical approach and association rule mining to discover potential PPIs, GIs, and disease-gene associations in the text. The results for PPIs and GIs were then evaluated against an external biological database of PPIs.
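As an illustration of the trend analysis step, here is a minimal Python sketch of the Mann-Kendall test applied to a single series, for example a topic's yearly share of documents. It uses the standard S statistic with a tie-corrected variance and a normal approximation for the two-sided p-value; the function name `mann_kendall` and the example series are assumptions, and the thesis may have used a different variant or library.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Mann-Kendall trend test. Returns (S, Z, two-sided p-value) using the
    normal approximation, which is usually recommended for about 10+ points."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with the standard correction for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(series).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s == 0.0:
        return s, 0.0, 1.0  # all values identical: no evidence of a trend
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


if __name__ == "__main__":
    yearly_share = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # invented example values
    print(mann_kendall(yearly_share))
```

A positive Z with a small p-value suggests an increasing trend for the topic, a negative Z a decreasing one; the test is rank-based, so it does not assume a linear trend.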
|
127 |
Diagnosing intraventricular hemorrhage from brain ultrasound images using machine learning
Dalla Santa, Chiara January 2023
No description available.
|
128 |
CRISPR-Drawr, a tool to design mutagenic primer
Torbjörn, Larsson January 2023
Short open reading frames (sORFs) are codon sequences with a start and a stop codon within at most 100 codons. Cells produce many transcripts from them, and some sORFs have been found to have a function. sORFs have been associated with embryogenesis, myogenesis, immunity and various diseases, including cancers. Cell culture screening is a common method for studying sORF function. By inserting mutations at known sORF locations, one can affect their translation by removing start codons, inserting premature stop codons, or removing native stop codons. A new tool set for doing this is CRISPR technology, where a single guide RNA (gRNA) can be used to make more precise genome edits. Unfortunately, such design is nontrivial and yields many variants for testing. It results in a back-and-forth testing process involving different available design tools. In this project, a comprehensive way to view and iterate over the many test combinations was developed. This is intended to ease the process and decrease the likelihood of errors. The developed solution is a tool that integrates the currently best design tools. It also introduces a new quality summary score that can evaluate the estimated outcomes of the various designed guide variants. The tool was tested, and it was found that the score simplifies and amplifies the previously used scoring methods. The pipeline is simple to install and use, integrates the currently most actively developed tools, and an installation is as future-proof as can be made in a rapidly evolving field.
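The sORF definition at the start of the abstract can be turned into a short scan. The Python sketch below is not part of CRISPR-Drawr; it is a minimal illustration that assumes ATG as the only start codon, searches the forward strand only, and counts both the start and the stop codon towards the 100-codon limit. The function name `find_sorfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(dna, max_codons=100):
    """Return (start, end) index pairs (end exclusive) of forward-strand ORFs
    that begin with ATG and reach an in-frame stop codon within `max_codons`
    codons, counting both the start and the stop codon (an assumption)."""
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the reading frame set by this start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            n_codons = (pos - start) // 3 + 1
            if n_codons > max_codons:
                break  # too long to count as a short ORF
            if dna[pos:pos + 3] in STOP_CODONS:
                hits.append((start, pos + 3))
                break
    return hits


if __name__ == "__main__":
    print(find_sorfs("CCATGAAATAGCC"))
```

The start index of each hit locates the start codon and the last three bases of each hit locate the native stop codon, which are the sites the abstract names as mutation targets.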
|
129 |
Characterization of the evolution of satellite DNA across Passeriformes
Martins Borges, Inês January 2022
Satellite DNA (satDNA) is among the fastest evolving elements in the genome and is highly abundant in some eukaryotic genomes. Its highly repetitive nature makes it challenging to assemble, so it is underrepresented in most assemblies and often understudied as a result. Birds are an ideal model for the study of satDNA and its evolution, since the large number of available sequenced genomes of this clade allows for dense sampling across various evolutionary timescales, and the low number of satDNA families within their satellitomes facilitates their study and comparison between species. Here, we characterize satDNA and its evolution across Passeriformes, an avian clade containing two-thirds of all bird species and spanning ~50 million years of evolution. With this goal, we use both short-read data and long-read assemblies of species representative of over 30 passerine families in this clade to shed light on the evolution of its satellitome. We focus on examining the phylogenetic relationships between satellites common to most species as well as characterizing satellite array structure and location in genome assemblies. We also analyse satellite abundance in each genome, focusing on differences in satellite content between male and female individuals to look for satellites present in the female-specific W sex chromosome and the germline-restricted chromosome. We found seven satDNA families shared by a quarter of the species, which were likely present in an ancestral species shared by most, if not all, species of Passeriformes. We observed that satDNA evolution is complex and does not follow species phylogeny, and that satellite arrays generally have a simple head-to-tail conformation, with evidence in four of the sampled species of satDNA arrays with higher-order repeats. We also found two satDNA families with fairly consistent monomer length and conserved regions that we hypothesise might be functional.
|
130 |
Pipeline for Next Generation Sequencing data of phage displayed libraries to support affinity ligand discovery
Schleimann-Jensen, Ella January 2022
Affinity ligands are important molecules used in affinity chromatography for purification of significant substances from complex mixtures. Finding affinity ligands specific to important target molecules can be a challenging process. Cytiva uses the powerful phage display technique to find new promising affinity ligands. The phage display technique is a method run in several enrichment cycles. When developing new affinity ligands, a protein scaffold library with a diversity of up to 10^10 to 10^11 different protein scaffold variants is run through the enrichment cycles. The result from the phage display rounds is screened for target molecule binding followed by sequencing, usually with one of the conventional screening methods ELISA or Biacore followed by Sanger sequencing. However, the throughput of these analyses is unfortunately very low, often with only a few hundred screened clones. Therefore, Next Generation Sequencing (NGS), which generates millions of sequences from each phage display round, has become an increasingly popular screening method for phage display libraries. This creates a need for a robust data analysis pipeline to interpret the large amounts of data. In this project, a pipeline for analysis of NGS data of phage displayed libraries has been developed at Cytiva. Cytiva uses NGS as one of its screening methods for phage displayed protein libraries because of the high throughput compared to the conventional screening methods. The purpose is to find new affinity ligands for purification of essential substances used in drugs. The pipeline has been created using the object-oriented programming language R and consists of several analyses covering the most important steps needed to find promising results in the NGS data. With the developed pipeline, the user can analyze the data at both the DNA and the protein sequence level, obtain a per-position residue breakdown, and filter the data based on specific amino acids and positions. This gives a robust and thorough analysis which can lead to promising results that can be used in the development of novel affinity ligands for future purification products.
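The per-position residue breakdown and the position-based filtering mentioned above can be sketched as follows. The thesis pipeline is written in R; this Python version is only an illustration, and the function names, the 0-based positions and the toy sequences are assumptions rather than anything taken from the Cytiva pipeline. Equal-length protein sequences are assumed, as would be expected for variants of a single scaffold.

```python
from collections import Counter


def residue_breakdown(sequences):
    """For each position, return a dict mapping residue -> relative frequency.
    All sequences are assumed to have the same length."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    counts = [Counter() for _ in range(length)]
    for seq in sequences:
        for position, residue in enumerate(seq):
            counts[position][residue] += 1
    total = len(sequences)
    return [{residue: n / total for residue, n in position_counts.items()}
            for position_counts in counts]


def filter_by_residue(sequences, position, residue):
    """Keep only sequences carrying `residue` at the 0-based `position`."""
    return [seq for seq in sequences if seq[position] == residue]


if __name__ == "__main__":
    reads = ["ACD", "ACE", "GCD", "ACD"]  # invented toy sequences
    print(residue_breakdown(reads))
    print(filter_by_residue(reads, 2, "E"))
```

Comparing the breakdown between successive phage display rounds would show which residues become enriched at which positions.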
|