  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Evaluating the biological relevance of disease consensus modules : An in silico study of IBD pathology using a bioinformatics approach

Ströbaek, Joel January 2019
Inflammatory bowel disease encompasses a variety of heterogeneous chronic inflammatory diseases that affect the gastrointestinal tract, of which Crohn’s disease and ulcerative colitis are the principal examples. The etiology of these, and of many other complex human diseases, remains largely unknown, which makes them relevant targets for novel research strategies. One such strategy is the in silico application of methods derived from network theory to data sourced from publicly available repositories of, e.g., gene expression data. Specifically, graphs of interconnected elements enriched by differentially expressed genes (disease modules) were inferred with data available through the Gene Expression Omnibus. Building on a previous method, the current project aimed to evaluate disease modules, combined from stand-alone inferential methods, as disease consensus modules representing pathophenotypical motifs for the diseases of interest. The modules found to be significantly enriched by single-nucleotide polymorphisms inferred from genome-wide association studies, as validated using the Pathway Scoring Algorithm, were subsequently analysed further using Kyoto Encyclopedia of Genes and Genomes pathway enrichment and literature searches. The results of this study agree with previous findings relating to the employed method, but lack any novelty pertaining to the diseases of interest. However, the results substantiate the preceding methods’ conclusion by including parameters that increase statistical validity. In addition, the study produced peripheral results concerning both the methodology of consensus module methods and the elucidation of inflammatory bowel disease etiology and disease subtype differentiation, which pose interesting subjects for future investigation.
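The idea of a module being "significantly enriched" by GWAS-associated genes can be illustrated with a one-sided hypergeometric test. This is a minimal sketch of a simpler, swapped-in technique, not the Pathway Scoring Algorithm used in the thesis; the function name and all gene counts below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, annotated: int, module: int, overlap: int) -> float:
    """Return P(X >= overlap) for a hypergeometric draw.

    universe  -- total number of genes considered (assumed background)
    annotated -- genes in the universe that carry a GWAS association
    module    -- number of genes in the disease module
    overlap   -- module genes that carry a GWAS association
    """
    upper = min(annotated, module)
    total = comb(universe, module)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, module - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers, chosen only to exercise the function.
p = hypergeom_enrichment_p(universe=20000, annotated=300, module=150, overlap=10)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value here only says that the overlap is larger than expected from random gene sets of the same size; it does not account for gene length or linkage disequilibrium, which dedicated GWAS tools handle.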
132

Identification of personalized multi-omic disease modules in asthma

Martínez Enguita, David January 2018
Asthma is a respiratory syndrome associated with airflow limitation, bronchial hyperresponsiveness and inflammation of the airways in the lungs. Despite the ongoing research efforts, the outstanding heterogeneity displayed by the multiple forms in which this condition presents often hampers the attempts to determine and classify the phenotypic and endotypic biological structures at play, even when considering a limited assembly of asthmatic subjects. To increase our understanding of the molecular mechanisms and functional pathways that govern asthma from a systems medicine perspective, a computational workflow focused on the identification of personalized transcriptomic modules from the U-BIOPRED study cohorts, by the use of the novel MODifieR integrated R package, was designed and applied. A feature selection of candidate asthma biomarkers was implemented, accompanied by the detection of differentially expressed genes across sample categories, the production of patient-specific gene modules and the subsequent construction of a set of core disease modules of asthma, which were validated with genomic data and analyzed for pathway and disease enrichment. The results indicate that the approach utilized is able to reveal the presence of components and signaling routes known to be crucially involved in asthma pathogenesis, while simultaneously uncovering candidate genes closely linked to the latter. The present project establishes a valuable pipeline for the module-driven study of asthma and other related conditions, which can provide new potential targets for therapeutic intervention and contribute to the development of individualized treatment strategies.
133

Proteus : A new predictor for protean segments

Söderquist, Fredrik January 2015
The discovery of intrinsically disordered proteins has led to a paradigm shift in protein science. Many disordered proteins have regions that can transform from a disordered state to an ordered one. Those regions are called protean segments. Many intrinsically disordered proteins are involved in diseases, including Alzheimer's disease, Parkinson's disease and Down's syndrome, which makes them prime targets for medical research. As protean segments are often the functional part of the proteins, it is of great importance to identify those regions. This report presents Proteus, a new predictor for protean segments. The predictor uses Random Forest (a decision tree ensemble classifier) and is trained on features derived from amino acid sequence and conservation data. Proteus compares favourably to state-of-the-art predictors and performs better than the competition on all four metrics: precision, recall, F1 and MCC. The report also looks at the differences between protean and non-protean regions and at how these differ between the two datasets that were used to train the predictor.
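The four metrics named in the abstract can all be computed from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function name and the example counts are hypothetical and are not taken from the thesis.

```python
from math import sqrt


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute precision, recall, F1 and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical per-residue counts for a protean-segment predictor.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside F1 when the classes are imbalanced.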
134

Model-Based Hypothesis Testing in Biomedicine : How Systems Biology Can Drive the Growth of Scientific Knowledge

Johansson, Rikard January 2017
The utilization of mathematical tools within biology and medicine has traditionally been less widespread than in other hard sciences, such as physics and chemistry. However, an increased need for tools such as data processing, bioinformatics, statistics, and mathematical modeling has emerged due to advancements during the last decades. These advancements are partly due to the development of high-throughput experimental procedures and techniques, which produce ever-increasing amounts of data. For all aspects of biology and medicine, these data reveal a high level of interconnectivity between components, which operate on many levels of control, and with multiple feedbacks both between and within each level of control. However, the availability of these large-scale data is not synonymous with a detailed mechanistic understanding of the underlying system. Rather, a mechanistic understanding is gained only when we construct a hypothesis and test its predictions experimentally. Identifying interesting predictions that are quantitative in nature generally requires mathematical modeling. This, in turn, requires that the studied system can be formulated as a mathematical model, such as a series of ordinary differential equations, where different hypotheses can be expressed as precise mathematical expressions that influence the output of the model. Within specific sub-domains of biology, the utilization of mathematical models has had a long tradition, such as the modeling of electrophysiology by Hodgkin and Huxley in the 1950s. However, it is only in recent years, with the arrival of the field known as systems biology, that mathematical modeling has become more commonplace.
The somewhat slow adoption of mathematical modeling in biology is partly due to historical differences in training and terminology, as well as to a lack of awareness of showcases illustrating how modeling can make a difference, or even be required, for a correct analysis of the experimental data. In this work, I provide such showcases by demonstrating the universality and applicability of mathematical modeling and hypothesis testing in three disparate biological systems. In Paper II, we demonstrate how mathematical modeling is necessary for the correct interpretation and analysis of dominant negative inhibition data in insulin signaling in primary human adipocytes. In Paper III, we use modeling to determine transport rates across the nuclear membrane in yeast cells, and we show how this technique is superior to traditional curve-fitting methods. We also demonstrate the issue of population heterogeneity and the need to account for individual differences between cells and the population at large. In Paper IV, we use mathematical modeling to reject three hypotheses concerning the phenomenon of facilitation in pyramidal nerve cells in rats and mice. We also show how one surviving hypothesis can explain all data and adequately describe independent validation data. Finally, in Paper I, we develop a method for model selection and discrimination using parametric bootstrapping and the combination of several different empirical distributions of traditional statistical tests. We show how the empirical log-likelihood ratio test is the best combination of two tests and how this can be used not only for model selection but also for model discrimination. In conclusion, mathematical modeling is a valuable tool for analyzing data and testing biological hypotheses, regardless of the underlying biological system.
Further development of modeling methods and applications is therefore important, since these will in all likelihood play a crucial role in all future aspects of biology and medicine, especially in dealing with the burden of the increasing amounts of data made available by new experimental techniques. / The use of mathematical tools in biology and medicine has traditionally been less widespread than in other natural sciences, such as physics and chemistry. An increased need for tools such as data processing, bioinformatics, statistics, and mathematical modeling has emerged thanks to advances during recent decades. These advances are partly a result of the development of large-scale data collection techniques. In all areas of biology and medicine, these data have revealed a high level of interconnectivity between components, which operate on many levels of control and with multiple feedbacks both between and within each level of control. However, access to large-scale data is not synonymous with a detailed mechanistic understanding of the underlying system. Rather, a mechanistic understanding is reached only when we construct a hypothesis whose predictions we can test experimentally. Identifying interesting predictions that are quantitative in nature generally requires mathematical modeling. This in turn requires that the studied system can be formulated as a mathematical model, such as a series of ordinary differential equations, where different hypotheses can be expressed as precise mathematical expressions that influence the output of the model. Within certain subfields of biology, the use of mathematical models has had a long tradition, such as the modeling of electrophysiology by Hodgkin and Huxley in the 1950s. However, it is only in recent years, with the arrival of the field of systems biology, that mathematical modeling has become commonplace. The somewhat slow adoption of mathematical modeling in biology is due, among other things,
to historical differences in training and terminology, as well as to a lack of awareness of examples illustrating how modeling can make a difference and is in fact often required for a correct analysis of experimental data. In this work, I provide such examples and demonstrate the universality and applicability of mathematical modeling and hypothesis testing in three different biological systems. In Paper II, we show how mathematical modeling is necessary for a correct interpretation and analysis of dominant negative inhibition data in insulin signaling in primary human adipocytes. In Paper III, we use modeling to determine transport rates across the nuclear membrane in yeast cells, and we show how this technique is superior to traditional curve-fitting methods. We also demonstrate the issue of population heterogeneity and the need to account for individual differences between cells and the population as a whole. In Paper IV, we use mathematical modeling to reject three hypotheses about how the phenomenon of facilitation arises in pyramidal nerve cells in rats and mice. We also show how one surviving hypothesis can describe all data, including independent validation data. Finally, in Paper I, we develop a method for model selection and model discrimination using parametric bootstrapping and the combination of different empirical distributions of traditional statistical tests. We show how the empirical log-likelihood ratio test is the best combination of two tests and how the test is applicable not only for model selection but also for model discrimination. In summary, mathematical modeling is a valuable tool for analyzing data and testing biological hypotheses, regardless of the underlying biological system.
Further development of modeling methods and applications is therefore important, since these will likely play a decisive role in the future of biology and medicine, especially when it comes to handling the burden of the increasing amounts of data made available by new experimental techniques.
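The combination of parametric bootstrapping with a log-likelihood ratio test described for Paper I can be sketched on a toy problem. This is only an illustration under strong assumptions: instead of ODE models, the null model is a normal distribution with mean 0 and unit variance, the alternative has a free mean, and the function names are hypothetical. For this pair of models the statistic 2(ll_alt - ll_null) reduces to n times the squared sample mean.

```python
import random


def lr_statistic(sample: list) -> float:
    """Log-likelihood ratio statistic for N(mu, 1) versus N(0, 1)."""
    n = len(sample)
    sample_mean = sum(sample) / n
    return n * sample_mean * sample_mean


def bootstrap_p_value(sample: list, n_boot: int = 2000, seed: int = 0) -> float:
    """Empirical p-value from data simulated under the null model."""
    rng = random.Random(seed)
    observed = lr_statistic(sample)
    n = len(sample)
    exceed = 0
    for _ in range(n_boot):
        # Parametric bootstrap: draw a synthetic data set from the null model.
        simulated = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if lr_statistic(simulated) >= observed:
            exceed += 1
    # The +1 terms keep the estimate away from an impossible p-value of zero.
    return (exceed + 1) / (n_boot + 1)


data = [0.9, 1.4, -0.2, 1.1, 0.7, 1.8, 0.3, 1.2]  # hypothetical measurements
print(f"empirical p-value: {bootstrap_p_value(data):.4f}")
```

The appeal of the empirical distribution is that it does not rely on the asymptotic chi-squared approximation, which may be poor for small data sets or non-nested models.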
135

ARG-MATEE Automated Pipeline for Detection of Antimicrobial Resistance in WGS Data Collected from Pig Farms and Surrounding Communities / Tracking Antimicrobial Resistance at Pig Farms

Halstead, Holly January 2020
As the interconnected nature of different sectors in relation to health has become recognized, AMR (antimicrobial resistance) has emerged as an issue of high global importance. E. coli isolates were taken from pig farms in Thailand, which serves as a point of interest in the study of ARGs (antimicrobial resistance genes) in emerging economies. Fecal samples were collected from pigs, from humans who came in contact with the pigs, and from humans who did not have contact with pigs, to be analyzed for ARGs, virulence genes, and plasmids. Data were analyzed with an automated pipeline, ARG-MATEE (the Antimicrobial Resistance Gene Multi-Analysis Tool for Enteric E. coli), a tool designed in this study to be used here and in future investigations. ARG-MATEE regulates and records internal software versions in a generated report, which also includes data tables for all non-phylogeny results in Boyce–Codd normal form and data visualizations for plasmids, ARGs, virulence genes, and phylogeny. Through the use of ARG-MATEE, the iss virulence gene was seen to differ significantly between testing groups, as it is present only in the human testing groups, suggesting loss of function of the iss gene in pigs and thus host specialization.
136

Development of a DNA barcode for species identification of tuna

Nordquist, Clara, Edwall, Jonathan, Eriksson, Leonora, Mäkinen, Nelly, Sayehban, Minna, Styfberg, Matilda January 2022
Today, DNA-barcoding with the gene COI is regularly used in the identification of fish. However, this is not an adequate way of identifying species of tuna due to COI lacking sufficient interspecies divergence. This is problematic since fraud and mislabeling are a major concern within the fish and tuna industries. Thus, there is a need for a new genetic barcode region when identifying the 15 tuna species within the tribe Thunnini. This study has considered six mitochondrial genetic regions (16S, ATP8, COII, CR, CytB, and ND2) and their potential as barcodes in comparison to COI. To be of practical use, the barcode has to be able to differentiate between all 15 tuna species, as well as contain conserved primer binding sites and be approximately 400 bp, or shorter. Analyses of the regions were made through Multiple Sequence Alignments built using ClustalW in Mega 11.0. The candidates were first evaluated through neighbor-joining trees and plots of inter- and intraspecies variation, and then analyzed further in search of conserved regions for primer binding, flanking a segment of approximately 400 bp (or shorter). This resulted in two possible barcode candidates with corresponding primers from the CR and ND2 genes. As a final step, these two were analyzed for specificity using BLAST, to evaluate their actual utility in differentiating the tuna species. The results show that they both can identify the different tuna species, but that ND2 is superior with 100% identification accuracy. In addition to the theoretical analysis, the ability of the primers was measured through a real PCR amplification. Unfortunately, only the CR barcode could be evaluated, but the results show it to be practically useful. Even though the utility of ND2 in PCR could not be analyzed, it is highly recommended as a region for further investigations. Given the strong theoretical support, it definitely shows promise as a new barcode for species identification of tuna.
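The comparison of inter- and intraspecies variation mentioned above can be illustrated with pairwise p-distances on aligned sequences. This is a minimal sketch, not the MEGA-based analysis used in the study; the function names, species labels and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are ignored.
    """
    sites = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in sites) / len(sites)


def barcode_gap(aligned: dict) -> float:
    """Smallest interspecies distance minus largest intraspecies distance.

    `aligned` maps a species label to a list of aligned sequences and must
    contain at least two species. A positive value means every
    between-species pair is more divergent than any within-species pair.
    """
    labelled = [(sp, seq) for sp, seqs in aligned.items() for seq in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return min(inter) - max(intra, default=0.0)


# Toy alignment with made-up sequences, for illustration only.
toy = {
    "species_a": ["ACGTACGT", "ACGTACGA"],
    "species_b": ["ACGAACTT", "ACGAACTT"],
}
print(f"barcode gap: {barcode_gap(toy):.3f}")
```

A candidate region with a non-positive gap for any species pair would fail the requirement that the barcode differentiates between all species.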
137

Filtering of Clinical NGS Data to Improve Low Allele Frequency Variant Calling

Cumlin, Tomas January 2022
Massively parallel sequencing (NGS) is useful for detecting and later classifying somatic driver mutations in cancer tumours. False-positive variants occur in the NGS workflow, and they may be mistaken for low-frequency somatic cancer mutations in a patient sample. This creates a need to decrease the noise rate in the NGS workflow, since doing so may improve the detection of low allele frequency variants, in particular cancer mutations. In this project, the aim was to reduce the level of false-positive variants in an NGS workflow. The scope was limited to substitution errors and their neighbouring nucleotides. The project also served to examine how different types of substitution errors are distributed in the data, whether their frequencies are affected by neighbouring nucleotides, and how data processing may affect these substitution rates. A bioinformatic pipeline was set up in which a commercially available genomic DNA sample with known variants was subjected to different trimming and filtering settings. The goal was to reduce the substitution error rate as much as possible without removing any true variants from the data. The optimised settings were trimming 5 bp from the tail of the sequencing reads and filtering out sequencing reads that contained 5 or more substitutions. Three additional samples, of which two were clinical and the third commercial, were tested with these settings. The results showed that in all samples, C:G>T:A substitutions had a higher frequency than the other substitution types. For all samples, A:T>C:G substitutions where the neighbouring nucleotide was a C or a G on each side had a higher frequency than A:T>C:G substitutions with other neighbouring nucleotides on both sides. Those substitution types were especially targeted by the trimming.
For the two commercial samples, substitutions that resulted in the nucleotide combinations >XAA or >XTT were of a higher frequency compared to the same substitution types that did not result in those nucleotide combinations. Filtering reads with 5 or more substitutions particularly targeted these substitution types. Consequently, filtering had a greater effect on the commercial samples, compared to the clinical samples. Overall, trimming and filtering helped reduce transversions more than the transitions, increasing the transition/transversion ratio after processing the data. The results suggest that trimming and filtering can be a useful method to computationally reduce the transversion errors introduced in an NGS workflow, but transition errors to a lesser extent, in particular A:T>G:C transitions. To confirm these findings, more samples should be tested using this methodology. To better understand the effect of trimming and filtering on variant calling, the scope could in the future be expanded to also look at small insertions and deletions.
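The processing steps and the transition/transversion ratio described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: reads are already aligned to the reference without indels, trimming is applied before counting substitutions (the abstract does not state the order), and all function names and sequences are hypothetical.

```python
PURINES = {"A", "G"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return ref != alt and ((ref in PURINES) == (alt in PURINES))


def ts_tv_ratio(substitutions: list) -> float:
    """Transition/transversion ratio for a list of (ref, alt) substitutions."""
    ts = sum(is_transition(ref, alt) for ref, alt in substitutions)
    tv = len(substitutions) - ts
    return ts / tv if tv else float("inf")


def trim_and_filter(aligned_reads: list, trim_tail: int = 5, max_substitutions: int = 5) -> list:
    """Trim bases from the read tail, then drop reads with too many mismatches.

    `aligned_reads` holds (read, reference) pairs of equal length. Reads with
    `max_substitutions` or more mismatches after trimming are discarded.
    """
    kept = []
    for read, ref in aligned_reads:
        end = len(read) - trim_tail
        read, ref = read[:end], ref[:end]
        mismatches = sum(a != b for a, b in zip(read, ref))
        if mismatches < max_substitutions:
            kept.append((read, ref))
    return kept


reads = [
    ("ACGTACGTAAAAA", "ACGTACGTCCCCC"),  # errors only in the trimmed tail
    ("TTTTTTGTACGTA", "ACGTACGTACGTA"),  # five mismatches after trimming
]
print(trim_and_filter(reads))
print(ts_tv_ratio([("C", "T"), ("G", "A"), ("A", "C")]))
```

In the first toy read all errors sit in the last 5 bp, so trimming alone rescues it; the second read is removed by the substitution filter.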
138

Statistical and machine learning methods to analyze large-scale mass spectrometry data

The, Matthew January 2016
As in many other fields, biology is faced with enormous amounts of data that contain valuable information that is yet to be extracted. The field of proteomics, the study of proteins, has the luxury of having large repositories containing data from tandem mass spectrometry experiments, readily accessible to everyone who is interested. At the same time, there is still a lot to discover about proteins as the main actors in cell processes and cell signaling. In this thesis, we explore several methods to extract more information from the available data using methods from statistics and machine learning. In particular, we introduce MaRaCluster, a new method for clustering mass spectra on large-scale datasets. This method uses statistical methods to assess similarity between mass spectra, followed by the conservative complete-linkage clustering algorithm. The combination of these two resulted in up to 40% more peptide identifications on its consensus spectra compared to the state-of-the-art method. Second, we attempt to clarify and promote protein-level false discovery rates (FDRs). Frequently, studies fail to report protein-level FDRs even though the proteins are actually the entities of interest. We provide a framework in which to discuss protein-level FDRs in a systematic manner, to open up the discussion and take away potential hesitance. We also benchmarked some scalable protein inference methods and included the best one in the Percolator package. Furthermore, we added functionality to the Percolator package to accommodate the analysis of studies in which many runs are aggregated. This reduced the run time for a recent study regarding a draft human proteome from almost a full day to just 10 minutes on a commodity computer, resulting in a list of proteins together with their corresponding protein-level FDRs.
139

Characterisation of Potential Inhibitors of Calmodulin from Plasmodium falciparum

Iversen, Alexandra, Nordén, Ebba, Bjers, Julia, Wickström, Filippa, Zhou, Martin, Hassan, Mohamed January 2020
Each year countless lives are affected and about half a million people die from malaria, a disease caused by parasites of the genus Plasmodium. The most virulent species of the parasite is Plasmodium falciparum (P. falciparum). Calmodulin (CaM) is a small, 148 amino acid long, highly conserved and essential protein in all eukaryotic cells. Previous studies have determined that CaM is important for the reproduction of P. falciparum and its invasion of host cells. The primary structures of human CaM (CaMhum) and CaM from P. falciparum (CaMpf) differ in merely 16 positions, making differences in their structures and ligand affinity interesting to study, especially since inhibitors that favour CaMpf over CaMhum could, by extension, give rise to new malaria treatments. Some antagonists, functioning as inhibitors of CaM, have already been analysed in previous studies. However, there are also compounds that have not yet been studied as possible antagonists of CaM. This study considers three known antagonists, trifluoperazine (TFP), calmidazolium (CMZ) and artemisinin (ART), and three recently created fentanyl derivatives: 3-OH-4-OMe-cyclopropylfentanyl (ligand 1), 4-OH-3OMe-4F-isobutyrylfentanyl (ligand 2) and 3-OH-4-OMe-isobutyrylfentanyl (ligand 3). Bioinformatic methods, such as modelling and docking, were used to compare the structures of CaMhum and CaMpf and to observe the interaction of the six ligands with CaM from both species. In addition to the differences in primary structure, identified with ClustalW, disparities in tertiary structure were observed. Structure analysis of CaMhum and CaMpf in PyMOL disclosed a more open conformation as well as a larger, more defined hydrophobic cleft in CaMhum compared to CaMpf. Simulated binding of the six ligands to CaM from both species, using Autodock 4.2, indicated that TFP and ART bind with higher affinity to CaMhum, which is expected.
Ligand 2 and ligand 3 also bound with higher affinity to CaMhum, which is reasonable since their docking is based on how TFP binds to CaM. However, ligand 1 and CMZ both bound to CaMpf with higher affinity. Despite promising results for ligand 1 and CMZ, no decisive conclusion can be drawn based solely on bioinformatic studies. To gain a better understanding of the protein-ligand interactions of the six ligands with CaMhum and CaMpf, further studies using e.g. circular dichroism and fluorescence would be advantageous. Future studies on the binding of CMZ and ligand 1 to CaM, as well as of ligands with similar characteristics, would be especially valuable, because the results from this study suggest that they are possibly better inhibitors of CaMpf than of CaMhum and could thereby function as antimalarial drugs.
140

Computational prediction of cell-cell interactions in the brain-tumour microenvironment

Camargo Romera, Paula January 2023
Glioblastoma is the fastest-growing and the most common malignant brain tumour in adults. It is normally treated with surgery and radio- or chemotherapy, but life expectancy is approximately 15 months, with a high probability of the cancer recurring. Therefore, there is a need to decrease its severity. Bulk and single-cell RNA sequencing allow the identification of cellular states in tumours affected by cell-intrinsic and extrinsic factors. Four different cellular states have been identified in glioblastoma: neural progenitor-like, oligodendrocyte progenitor-like, astrocyte-like, and mesenchymal-like. As glioblastoma is an immunosuppressive tumour, it can alter the immune system and increase the tumour's immune escape by secreting immunosuppressive factors or interacting with the brain microenvironment. Two datasets were used in this study to explore whether the localization of the tumour in the brain microenvironment and the tendency of glioblastomas to activate microglial cells are due to particular ligand-receptor interactions. Data quality control was applied to both datasets, and the SingleCellSignalR and CellphoneDB packages were used to predict the possible interactions. A total of seven experiments were designed for this study. The first dataset, GBmap, allowed a comparison between tumour cells and microglia, between tumour cells and other cell types in the brain, and between the four cellular states of glioblastoma and microglia and macrophages. Next, healthy microglia from GBmap were compared with the tumour bulk data from the second dataset, HGCC. The bootstrap technique was used to compare bulk data with single-cell data, and a comparison between tumour cells and microglia or other cell types was analysed. Results showed specific and shared interactions between cell types or cellular states, revealing that the different localization of the tumour cells depends on the expressed ligand-receptor pairs.
In addition, four patterns of interactions with different tendencies to activate microglial cells were found in the 50 samples. These are promising results for further exploring drugs that interfere with these interactions, or how the interactions are related to patient survival. Furthermore, although glioblastoma is a heterogeneous disease, more interactions were predicted with microglial/macrophage cells, without a uniform pattern between patients. This study is therefore a starting point, and further in vitro studies would be needed to examine the predicted interactions as potential targets to stop the progression of this type of cancer.
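The general idea behind ligand-receptor interaction prediction can be illustrated by scoring each known ligand-receptor pair from the average expression of the ligand in a sender cell type and of the receptor in a receiver cell type. This is a generic sketch and not the scoring used by SingleCellSignalR or CellphoneDB; the geometric-mean score, the function name, the gene labels and the expression values are all hypothetical.

```python
from math import sqrt
from statistics import mean


def interaction_scores(expression: dict, pairs: list, sender: str, receiver: str) -> dict:
    """Score ligand-receptor pairs between two cell types.

    `expression` maps cell type -> gene -> list of per-cell expression values.
    The score is the geometric mean of the average ligand expression in the
    sender and the average receptor expression in the receiver. Genes that
    are missing from a cell type are treated as not expressed.
    """
    scores = {}
    for ligand, receptor in pairs:
        ligand_level = mean(expression[sender].get(ligand, [0.0]))
        receptor_level = mean(expression[receiver].get(receptor, [0.0]))
        scores[(ligand, receptor)] = sqrt(ligand_level * receptor_level)
    return scores


# Made-up expression values for two placeholder genes.
toy_expression = {
    "tumour": {"LIG1": [4.0, 4.0], "LIG2": [0.0, 0.0]},
    "microglia": {"REC1": [1.0, 1.0]},
}
toy_pairs = [("LIG1", "REC1"), ("LIG2", "REC1"), ("LIG1", "REC2")]
result = interaction_scores(toy_expression, toy_pairs, "tumour", "microglia")
for pair, score in result.items():
    print(pair, score)
```

Because the score is a product of both expression levels, a pair scores zero whenever either partner is not expressed, which matches the intuition that both the ligand and the receptor must be present for an interaction to occur.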
