141 |
ARG-MATEE Automated Pipeline for Detection of Antimicrobial Resistance in WGS Data Collected from Pig Farms and Surrounding Communities / Tracking Antimicrobial Resistance at Pig Farms
Halstead, Holly. January 2020
Recognition of the interconnected nature of different sectors in relation to health has made AMR (antimicrobial resistance) an issue of high global importance. E. coli isolates were taken from pig farms in Thailand, a point of interest in the study of ARGs (antimicrobial resistance genes) in emerging economies. Fecal samples were collected from pigs, from humans who came in contact with the pigs, and from humans who had no contact with pigs, and were analyzed for ARGs, virulence genes, and plasmids. The data were analyzed with ARG-MATEE, the Antimicrobial Resistance Gene Multi-Analysis Tool for Enteric E. coli, an automated pipeline designed in this study for use here and in future investigations. ARG-MATEE regulates and records internal software versions in a generated report, which also includes data tables in Boyce–Codd normal form for all non-phylogeny results and data visualizations for plasmids, ARGs, virulence genes, and phylogeny. Using ARG-MATEE, the iss virulence gene was found to differ significantly between testing groups: it is present only in the human testing groups, suggesting loss of function of the iss gene in pigs and indicating host specialization.
|
142 |
Development of a DNA barcode for species identification of tuna
Nordquist, Clara; Edwall, Jonathan; Eriksson, Leonora; Mäkinen, Nelly; Sayehban, Minna; Styfberg, Matilda. January 2022
Today, DNA barcoding with the gene COI is regularly used to identify fish. However, COI is not adequate for identifying species of tuna because it lacks sufficient interspecies divergence. This is problematic since fraud and mislabeling are major concerns within the fish and tuna industries. Thus, a new genetic barcode region is needed for identifying the 15 tuna species within the tribe Thunnini. This study considered six mitochondrial genetic regions (16S, ATP8, COII, CR, CytB, and ND2) and their potential as barcodes in comparison to COI. To be of practical use, the barcode has to differentiate between all 15 tuna species, contain conserved primer binding sites, and be approximately 400 bp or shorter. The regions were analyzed through multiple sequence alignments built using ClustalW in MEGA 11.0. The candidates were first evaluated through neighbor-joining trees and plots of inter- and intraspecies variation, and then analyzed further in search of conserved regions for primer binding flanking a segment of approximately 400 bp or shorter. This resulted in two possible barcode candidates with corresponding primers, from the CR and ND2 regions. As a final step, these two were analyzed for specificity using BLAST to evaluate their actual utility in differentiating the tuna species. The results show that both can identify the different tuna species, but that ND2 is superior, with 100% identification accuracy. In addition to the theoretical analysis, the performance of the primers was tested through PCR amplification. Unfortunately, only the CR barcode could be evaluated, but the results show it to be practically useful. Even though the utility of ND2 in PCR could not be analyzed, it is highly recommended as a region for further investigation. Given the strong theoretical support, it shows promise as a new barcode for species identification of tuna.
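The comparison of inter- and intraspecies variation described above can be illustrated with a minimal sketch. It assumes uncorrected p-distances on pre-aligned sequences; the abstract does not state which distance measure was used, and the function names and toy sequences are hypothetical, not taken from the thesis.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def barcode_gap(aligned):
    """Return (largest intraspecies, smallest interspecies) p-distance.

    `aligned` is a list of (species_label, aligned_sequence) pairs.
    A candidate region separates species cleanly when the second
    value exceeds the first.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(aligned, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter, default=0.0)


# Toy alignment (hypothetical data, two made-up species labels).
toy = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACTTAC"),
    ("species_B", "ACGAACTTAC"),
]
max_intra, min_inter = barcode_gap(toy)
print(f"max intraspecies: {max_intra:.2f}, min interspecies: {min_inter:.2f}")
```

A region such as COI in tuna would, under this measure, show a smallest interspecies distance at or below the largest intraspecies distance, which is the lack of divergence the abstract refers to.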
|
143 |
Filtering of Clinical NGS Data to Improve Low Allele Frequency Variant Calling
Cumlin, Tomas. January 2022
Massively parallel sequencing (NGS) is useful for detecting and subsequently classifying somatic driver mutations in cancer tumours. False-positive variants arise in the NGS workflow and may be mistaken for low-frequency somatic cancer mutations in a patient sample. Reducing the noise rate in the NGS workflow is therefore needed, since it may improve the detection of low allele frequency variants, in particular cancer mutations. The aim of this project was to reduce the level of false-positive variants in an NGS workflow. The scope was limited to substitution errors and their neighbouring nucleotides. The project also sought to understand how different types of substitution errors are distributed in the data, whether their frequencies are affected by neighbouring nucleotides, and how data processing may affect these substitution rates. A bioinformatic pipeline was set up in which a commercially available genomic DNA sample with known variants was subjected to different trimming and filtering settings. The goal was to reduce the substitution error rate as much as possible without removing any true variants from the data. The optimised settings were trimming 5 bp from the tail of the sequencing reads and filtering out reads that contained 5 or more substitutions. Three additional samples, two clinical and one commercial, were tested with these settings. The results showed that in all samples, C:G>T:A substitutions had a higher frequency than the other substitution types. For all samples, A:T>C:G substitutions flanked by a C or a G on each side had a higher frequency than A:T>C:G substitutions with other neighbouring nucleotides on both sides. Those substitution types were especially targeted by the trimming.
For the two commercial samples, substitutions that resulted in the nucleotide combinations >XAA or >XTT had a higher frequency than the same substitution types that did not result in those combinations. Filtering reads with 5 or more substitutions particularly targeted these substitution types. Consequently, filtering had a greater effect on the commercial samples than on the clinical samples. Overall, trimming and filtering reduced transversions more than transitions, increasing the transition/transversion ratio after processing. The results suggest that trimming and filtering can be a useful way to computationally reduce the transversion errors introduced in an NGS workflow, but are less effective against transition errors, in particular A:T>G:C transitions. To confirm these findings, more samples should be tested using this methodology. To better understand the effect of trimming and filtering on variant calling, the scope could in the future be expanded to include small insertions and deletions.
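The two processing steps and the transition/transversion ratio mentioned above can be sketched as follows. This is an illustration only, assuming each read is already aligned without gaps to a reference segment of equal length; the thesis pipeline itself is not described in the abstract, and all function names here are hypothetical.

```python
# Purine<->purine and pyrimidine<->pyrimidine changes are transitions;
# every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def trim_tail(read: str, ref: str, n: int = 5):
    """Drop the last n bases of a read and of its aligned reference segment."""
    return (read[:-n], ref[:-n]) if n > 0 else (read, ref)


def substitutions(read: str, ref: str):
    """List (reference_base, read_base) pairs at mismatching positions."""
    return [
        (r, q)
        for r, q in zip(ref, read)
        if r != q and r in "ACGT" and q in "ACGT"
    ]


def keep_read(read: str, ref: str, max_subs: int = 5) -> bool:
    """Keep a read only if it has fewer than max_subs substitutions."""
    return len(substitutions(read, ref)) < max_subs


def ti_tv_ratio(subs) -> float:
    """Ratio of transitions to transversions in a list of substitutions."""
    ti = sum(1 for s in subs if s in TRANSITIONS)
    tv = len(subs) - ti
    return ti / tv if tv else float("inf")


# Toy example (hypothetical sequences): one mismatch sits in the tail.
read, ref = "ACGTACGTACGTTCA", "ACGTACGTACGTACA"
trimmed_read, trimmed_ref = trim_tail(read, ref, n=5)
print(substitutions(read, ref), substitutions(trimmed_read, trimmed_ref))
```

The defaults of 5 bases and 5 substitutions mirror the optimised settings reported in the abstract; a real pipeline would work on alignment records rather than bare strings.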
|
144 |
Statistical and machine learning methods to analyze large-scale mass spectrometry data
The, Matthew. January 2016
Like many other fields, biology is faced with enormous amounts of data containing valuable information that is yet to be extracted. The field of proteomics, the study of proteins, has the luxury of large repositories of data from tandem mass spectrometry experiments, readily accessible to anyone who is interested. At the same time, there is still a lot to discover about proteins as the main actors in cell processes and cell signaling. In this thesis, we explore several ways to extract more information from the available data using methods from statistics and machine learning. In particular, we introduce MaRaCluster, a new method for clustering mass spectra in large-scale datasets. This method uses statistical methods to assess similarity between mass spectra, followed by the conservative complete-linkage clustering algorithm. The combination of these two resulted in up to 40% more peptide identifications on its consensus spectra compared to the state-of-the-art method. Second, we attempt to clarify and promote protein-level false discovery rates (FDRs). Frequently, studies fail to report protein-level FDRs even though the proteins are actually the entities of interest. We provide a framework in which to discuss protein-level FDRs in a systematic manner, to open up the discussion and take away potential hesitance. We also benchmarked some scalable protein inference methods and included the best one in the Percolator package. Furthermore, we added functionality to the Percolator package to accommodate the analysis of studies in which many runs are aggregated. This reduced the run time for a recent study regarding a draft human proteome from almost a full day to just 10 minutes on a commodity computer, resulting in a list of proteins together with their corresponding protein-level FDRs. / QC 20160412
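Complete-linkage clustering, named above as the second stage of MaRaCluster, can be sketched generically as follows. This is not MaRaCluster's implementation: it assumes a precomputed symmetric dissimilarity matrix and a fixed cut-off, both hypothetical stand-ins for the statistical similarity assessment the abstract mentions, and it uses a naive search that would not scale to large datasets.

```python
def complete_linkage(dist, threshold):
    """Agglomerative clustering with the complete-linkage criterion.

    `dist` is a symmetric matrix (list of lists) of pairwise
    dissimilarities. Two clusters are merged only while the LARGEST
    dissimilarity between any pair of their members stays at or below
    `threshold`, which is what makes the criterion conservative.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None  # (linkage distance, index a, index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break  # no remaining pair of clusters is close enough
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy dissimilarities between four hypothetical spectra.
toy_dist = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.70, 0.95],
    [0.90, 0.70, 0.00, 0.20],
    [0.80, 0.95, 0.20, 0.00],
]
print(complete_linkage(toy_dist, threshold=0.5))
```

Because every member of a merged cluster must be within the cut-off of every other member, a single spurious match cannot chain two unrelated groups together, which is a common reason for preferring this criterion over single linkage.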
|
145 |
Characterisation of Potential Inhibitors of Calmodulin from Plasmodium falciparum
Iversen, Alexandra; Nordén, Ebba; Bjers, Julia; Wickström, Filippa; Zhou, Martin; Hassan, Mohamed. January 2020
Each year, countless lives are affected and about half a million people die from malaria, a disease caused by parasites of the genus Plasmodium. The most virulent species is Plasmodium falciparum (P. falciparum). Calmodulin (CaM) is a small, 148 amino acid long, highly conserved and essential protein in all eukaryotic cells. Previous studies have determined that CaM is important for the reproduction of P. falciparum and its invasion of host cells. The primary structures of human CaM (CaMhum) and CaM from P. falciparum (CaMpf) differ in merely 16 positions, making differences in their structures and ligand affinities interesting to study, especially since inhibitors selective for CaMpf over CaMhum could, by extension, give rise to new malaria treatments. Some antagonists that function as inhibitors of CaM have already been analysed in previous studies. However, there are also compounds that have not yet been studied as possible antagonists of CaM. This study concerns three known antagonists, trifluoperazine (TFP), calmidazolium (CMZ) and artemisinin (ART), and three recently created fentanyl derivatives: 3-OH-4-OMe-cyclopropylfentanyl (ligand 1), 4-OH-3OMe-4F-isobutyrylfentanyl (ligand 2) and 3-OH-4-OMe-isobutyrylfentanyl (ligand 3). Bioinformatic methods, such as modelling and docking, were used to compare the structures of CaMhum and CaMpf and to observe the interactions of the six ligands with CaM from both species. In addition to the differences in primary structure, identified with ClustalW, disparities in tertiary structure were observed. Structure analysis of CaMhum and CaMpf in PyMOL revealed a more open conformation and a larger, more defined hydrophobic cleft in CaMhum compared to CaMpf. Simulated binding of the six ligands to CaM from both species, using AutoDock 4.2, indicated that TFP and ART bind with higher affinity to CaMhum, as expected.
Ligand 2 and ligand 3 also bound with higher affinity to CaMhum, which is reasonable since their docking is based on how TFP binds to CaM. However, ligand 1 and CMZ both bound to CaMpf with higher affinity. Despite promising results for ligand 1 and CMZ, no decisive conclusion can be drawn from bioinformatic studies alone. To gain a better understanding of the protein-ligand interactions of the six ligands with CaMhum and CaMpf, further studies using, for example, circular dichroism and fluorescence would be advantageous. Future studies on the binding of CMZ and ligand 1 to CaM, as well as of ligands with similar characteristics, would be especially valuable, because the results of this study suggest that they may be better inhibitors of CaMpf than of CaMhum and could thereby function as antimalarial drugs.
|
146 |
Computational prediction of cell-cell interactions in the brain-tumour microenvironment
Camargo Romera, Paula. January 2023
Glioblastoma is the fastest-growing and most common malignant brain tumour in adults. It is normally treated with surgery and radio- or chemotherapy, but life expectancy is approximately 15 months, with a high probability of recurrence. Therefore, there is a need to reduce its severity. Bulk and single-cell RNA sequencing allow the identification of cellular states in tumours affected by cell-intrinsic and cell-extrinsic factors. Four cellular states have been identified in glioblastoma: neural progenitor-like, oligodendrocyte progenitor-like, astrocyte-like, and mesenchymal-like. As glioblastoma is an immunosuppressive tumour, it can alter the immune system and increase the tumour's immune escape by secreting immunosuppressive factors or interacting with the brain microenvironment. Two datasets were used in this study to explore whether the localization of the tumour in the brain microenvironment and the tendency of glioblastomas to activate microglial cells are due to particular ligand-receptor interactions. Quality control was applied to both datasets, and the SingleCellSignalR and CellPhoneDB packages were used to predict possible interactions. A total of seven experiments were designed for this study. The first dataset, GBmap, allowed comparisons between tumour cells and microglia, between tumour cells and other cell types in the brain, and between the four cellular states of glioblastoma and microglia and macrophages. Next, healthy microglia from GBmap were compared with the tumour bulk data from the second dataset, HGCC. Bootstrapping was performed to compare bulk data with single-cell data, and a comparison between tumour cells and microglia or other cell types was analysed. The results showed specific and shared interactions between cell types or cellular states, revealing that the different localization of the tumour cells depends on the expressed ligand-receptor pairs.
Also, four patterns of interactions with differing tendencies to activate microglial cells were found in the 50 samples; these are promising results for further exploring drugs that interfere with these interactions or how the interactions relate to patient survival. Furthermore, although glioblastoma is a heterogeneous disease, more interactions were predicted with microglial/macrophage cells, without a uniform pattern between patients. This study is therefore a starting point, and further in vitro studies are needed to examine the predicted interactions as potential targets for stopping the progression of this type of cancer.
|
147 |
Identifying structural variants from plant short-read sequencing data
Buinovskaja, Greta. January 2022
No description available.
|
148 |
Isolation of the native chloroplast proteome from plant for identification of protein-metabolite interactions / Isolering av det nativa kloroplastproteomet från planta i syfte att identifiera protein-metabolitinteraktioner
Strandberg, Linnéa. January 2021
In order to feed a growing population, crop yields need to increase. One way to achieve this is to optimise the photosynthetic activity of the plant, which includes improving carbon fixation. This requires knowledge of how key proteins in the chloroplast are regulated. The aim of this project is to identify possible regulatory protein-metabolite interactions in chloroplasts from Arabidopsis thaliana. The target proteins are the 11 enzymes of the Calvin-Benson-Bassham cycle. The metabolites of interest are 3PGA, ATP, FBP, and GAP, which are intermediates or co-factors of the cycle; 2PG, which is a product of a competing reaction in the cycle; and finally G6P, citrate, and sucrose, which are central metabolites in other vital reactions in the cell. Before the experiments with Arabidopsis, spinach was used as a test organism to evaluate the proposed protocols. First, chloroplasts were isolated from leaves. Once the integrity of the chloroplasts had been verified, the proteins were extracted. Interactions between the metabolites and the extracted proteins were analyzed with limited proteolysis-small molecule mapping. This method, which combines limited proteolysis with mass spectrometry, detected several protein-metabolite interactions. In Arabidopsis, all enzymes except FBPase, PPE, and TIM had at least one interaction. In spinach, interactions were seen with FBA, GAPDH, PGK, PRK, RuBisCO, TIM, and TK. The results highlight potential regulatory interactions, which could be used to identify bottlenecks in carbon fixation. This knowledge could in turn be used to increase the flux in the Calvin-Benson-Bassham cycle and thereby improve carbon fixation in plants.
|
149 |
Epidemiological and statistical basis for detection and prediction of influenza epidemics
Spreco, Armin. January 2017
A large number of emerging infectious diseases (including influenza epidemics) have been identified during the last century. The emergence and re-emergence of infectious diseases have a negative impact on global health. Influenza epidemics alone cause between 3 and 5 million cases of severe illness annually, and between 250,000 and 500,000 deaths. In addition to the human suffering, influenza epidemics also impose heavy demands on the health care system. For example, hospitals and intensive care units have limited excess capacity during infectious disease epidemics. Therefore, it is important that increased influenza activity is noticed early at local levels to allow time to adjust primary care and hospital resources that are already under pressure. Algorithms for the detection and prediction of influenza epidemics are essential components to achieve this. Although a large number of studies have reported algorithms for detection or prediction of influenza epidemics, outputs that fulfil standard criteria for operational readiness are seldom produced. Furthermore, in the light of the rapidly growing availability of “Big Data” from both diagnostic and pre-diagnostic (syndromic) data sources in health care and public health settings, a new generation of epidemiologic and statistical methods, using several data sources, is desired for reliable analyses and modeling. The rationale for this thesis was to inform the planning of local response measures and adjustments to health care capacity during influenza epidemics. The overall aim was to develop a method for detection and prediction of influenza epidemics. Before developing the method, three preparatory studies were performed. In the first of these studies, the associations (in terms of correlation) between diagnostic and pre-diagnostic data sources were examined, with the aim of investigating the potential of these sources for use in influenza surveillance systems.
In the second study, a literature study of detection and prediction algorithms used in the field of influenza surveillance was performed. In the third study, the algorithms found in the previous study were compared in a prospective evaluation study. In the fourth study, a method for nowcasting of influenza activity was developed using electronically available data for real-time surveillance in local settings, followed by retrospective application to the same data. This method includes three functions: detection of the start of the epidemic at the local level and predictions of the peak timing and the peak intensity. In the fifth and final study, the nowcasting method was evaluated by prospective application to authentic data from Östergötland County, Sweden. In the first study, correlations with large effect sizes between diagnostic and pre-diagnostic data were found, indicating that pre-diagnostic data sources have potential for use in influenza surveillance systems. However, it was concluded that further longitudinal research incorporating prospective evaluations is required before these sources can be used for this purpose. In the second study, a meta-narrative review approach was used in which two narratives for reporting prospective evaluation of influenza detection and prediction algorithms were identified: the biodefence informatics narrative and the health policy research narrative. As a result of the promising performances of one detection algorithm and one prediction algorithm in the third study, it was concluded that both further evaluation research and research on methods for nowcasting of influenza activity were warranted. In the fourth study, the performance of the nowcasting method was promising when applied to retrospective data, but it was concluded that thorough prospective evaluations are necessary before recommending the method for broader use.
In the fifth study, the performance of the nowcasting method was promising when prospectively applied to authentic data, implying that the method has potential for routine use. In future studies, the validity of the nowcasting method must be investigated by application and further evaluation in multiple local settings, including large urban areas.
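The correlation analysis of the first study can be illustrated with a short sketch. It assumes Pearson correlation on equally spaced count series and adds an optional lag so that a pre-diagnostic signal can be compared with later diagnostic counts; the abstract specifies neither the correlation measure nor any lag, and the function names and counts below are hypothetical.

```python
from math import sqrt


def pearson(x, y) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_correlation(pre_diagnostic, diagnostic, lag: int = 0) -> float:
    """Correlate pre-diagnostic counts with diagnostic counts `lag` steps later."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(pre_diagnostic, diagnostic)
    return pearson(pre_diagnostic[:-lag], diagnostic[lag:])


# Hypothetical weekly counts: the diagnostic series trails by one week.
syndromic = [1, 2, 4, 8, 5, 3, 2]
confirmed = [0, 1, 2, 4, 8, 5, 3]
for lag in (0, 1, 2):
    print(lag, round(lagged_correlation(syndromic, confirmed, lag), 3))
```

Scanning a small range of lags in this way is one simple means of judging whether a syndromic source gives advance warning, though it says nothing about prospective performance, which the abstract stresses must be evaluated separately.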
|
150 |
Introducing quality assessment and efficient management of cellular thermal shift assay mass spectrometry data
Hellner, Joakim. January 2017
Recent advances in molecular biology have led to the discovery of many new potential drugs. However, difficulties with in situ analysis of ligand binding prevent quick advancement in clinical trials, which stresses the need for better direct methods. A relatively new methodology, called Cellular Thermal Shift Assay (CETSA), allows detection of ligand binding in a cell's natural environment and can be used in combination with Mass Spectrometry (MS) for readout. With help from the Pelago Bioscience team, I developed a pipeline for processing CETSA MS data and a web-based system for viewing the results. The system, called CETSA Analytics, also evaluates the relevance of the results and helps its users locate information efficiently. CETSA Analytics is currently being tested by Pelago Bioscience AB as a tool for experimental data distribution.
|