  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Comparison of quality performance of whole genome sequencing analysis pipelines for foodborne pathogens

Ramsin, Chelsea January 2022
Campylobacter is the leading cause of gastroenteritis worldwide, and in Sweden there are official programs for the surveillance of the bacteria. One important objective of foodborne pathogen surveillance is molecular typing. As typing based on whole genome sequencing data is becoming more common, knowledge of how to set up analysis pipelines is essential to avoid variation in results. Here, typical whole genome sequencing pipelines are compared to a reference genome at different analysis stages to optimize assembly quality and typing results using cgMLST. The results show that read trimming before assembly with SPAdes yields higher-quality assemblies and better cgMLST results than assembling with SPAdes without read trimming. The opposite was shown for SKESA, where trimming beforehand had negative effects on the results, most likely because SKESA has built-in trimming properties. Additionally, post-assembly improvements generally had positive effects; however, these effects were small.
152

Multivariate analysis and classification of pathogenic priming components in wild-type and lab mice

Stoe, Armand January 2023
Animal models have a long history of being used in research for the purpose of investigating biological processes and testing the effect of specific compounds on the functionality of biological processes. Different types of mice are used as animal models, most notably inbred and outbred strains. This study investigates the effect of certain priming conditions on the production of cytokines in wild mice and lab mice, using multivariate data analysis. This analytical study involves exploratory analysis, in the form of PCA, MANOVA and LDA, training of different classification models and their validation. Based on the conducted exploratory analysis, certain priming conditions (CD3CD28, CPG and PG) have been identified as clearly defined groups by PCA and LDA, in both wild mice and lab mice. MANOVA concluded that most of the variables tested are statistically significant in determining group association. Subsequent classification modeling determined that the Random Forest algorithm is the most accurate in predicting class, in both the wild and lab mice. The performed analysis has given insight into the major trends exhibited by the data, but further post-processing analysis could potentially extract more data. The results of this study could be used to further investigate the discovered pattern in the data or be supplemented by comparing additional mouse strains under the same experimental conditions.
153

Influence of Preprocessing Steps for Molecular Data on Deep Neural Network Performance

Malla, Tajouj January 2023
The massive accumulation of omics data requires effective computational tools to analyze and interpret such data. Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI), has shed light on these challenges and achieved great success in bioinformatics. However, the influence of preprocessing steps on a DL model's performance remains a critical aspect that requires thorough investigation. This study aims to investigate the effects of different combinations of preprocessing techniques and feature selection methods on the predictive performance of deep neural networks (DNN) on supervised tasks. For this purpose, four normalization methods, one transformation method, and two feature scaling methods were applied, in addition to two feature selection methods. This comprehensive analysis resulted in a total of 28 unique combinations, each representing a unique classifier. The experimental analysis was conducted using gene expression profiles from multiple cancer datasets. The results highlight the significance of the preprocessing steps in achieving optimal DNN performance, with notable variations observed across different datasets and preprocessing techniques. We identify a specific preprocessing workflow that improves DNN performance, and certain preprocessing choices that may lead to suboptimal model performance. In addition, we identify potential pitfalls and challenges associated with the data structure and class imbalance. This study contributes to the understanding of the effect of preprocessing steps and provides insights into which preprocessing steps work best, which in turn improves the overall performance of DNN models and enables the development of more robust and accurate models.
154

Evaluation and implementation of quality control parameters for genome-wide DNA methylome sequencing

Ekberg, Sara January 2022
Epigenomics is the study of modifications to the genetic material without changes to the DNA sequence; one such modification is methylation of nucleotides. DNA methylation is associated with gene regulation and is studied in a variety of fields such as cancer and ageing. Quality control is essential when designing research studies to ensure that the end result is not affected by poor quality data. In this study, the aim was to define robust quality parameters for whole methylome sequencing for Illumina next generation sequencing data. Three different library preparation protocols, all designed for methylation analysis, were compared: Accel-NGS Methyl-Seq DNA library, NEB Next Enzymatic Methyl-seq and SPlinted Ligation Adapter Tagging. All samples were sequenced on the Illumina NovaSeq 6000 with paired-end 150 bp reads. An evaluation of alignment software was also included in the study. The nf-core methylseq pipeline version 1.6.1 was used to process all samples in the study. The pipeline was run multiple times with different settings depending on library type and software choice. Throughout the study, the parameters pUC19, lambda and alignment rate showed consistency, whereas overall methylation rate and coverage were affected by the origin of the sample material and the study design. In conclusion, not all proposed quality parameters were suitable for general quality control, since study design and origin of sample material have an impact, but alignment rate and the controls pUC19 and lambda show great promise for general quality control. Future work to establish sample-material-specific thresholds for methylation rate is encouraged.
155

Capturing genes with high impact based on reconstruction errors produced by variational autoencoders

Rieger, Utz Lovis January 2023
In this work we present a novel method to extract potential hub genes, transcription factors and regions with densely interconnected protein-protein interaction (PPI) networks from RNA-seq data. To achieve this, we deploy variational autoencoders, a generative machine learning framework, and extract the gene-wise reconstruction errors. The reconstruction error produced during training is treated here as a measure of a gene's impact on the transcriptome. The method can handle big datasets (3.5 GB and more) in reasonable time on home computers without any GPU acceleration. This allows users without access to large amounts of computational resources to work with large expression datasets as well. The final ranking based on reconstruction errors is subject to less bias than most hub gene inference methods currently available. In addition, no prior gene regulatory network inference is required. However, introducing a bias can help to focus on certain genes of interest. Here we introduced a bias by using genes present in the STRING database, which also eases the subsequent analysis. Analysis of the reconstruction error showed a tendency for genes with low reconstruction error to be genes of central importance to the dataset used for training. For healthy cells these were genes associated with housekeeping mechanisms, and for breast cancer data they were genes associated with breast cancer. In breast-cancer-specific data we found, for example, a high frequency of HOX family members linked specifically to breast cancer. For data covering different types of cancer the picture was broader and covered a wide range of genes associated with different types of cancer. There was also a high enrichment of transcription factors among the genes with low reconstruction error. Not only the regions with the lowest reconstruction error show a high enrichment of transcription factors; other regions show transcription factor enrichment as well. Transcription factors from these other regions differ in their correlation patterns. Regions with low reconstruction error and/or a high transcription factor enrichment show a high PPI enrichment and exhibit densely interconnected networks.
156

The battle against sepsis : exploring the genotypic diversity of pseudomonas and proteus clinical isolates

Ahmed, Suud January 2023
Sepsis is a dangerous and potentially fatal condition that has a mysterious origin, underscoring the significance of prompt and accurate diagnosis and treatment. Bacterial whole-genome sequencing, which is widely used in clinical microbiology, stands at the forefront of sequencing technologies, particularly to combat sepsis. The aim of this thesis is to improve sepsis treatment by examining the genetic characteristics and drug resistance patterns of the common sepsis-causing bacteria Pseudomonas and Proteus spp., by analyzing the whole-genome sequencing data of bacterial isolates using an in-house-developed pipeline. The results were compared with those of a commercial cloud-based platform from 1928 Diagnostic (Gothenburg, Sweden), as well as with the results from a clinical laboratory. Using Illumina HiSeq X next-generation sequencing technology, whole-genome data from 88 isolates of Pseudomonas and Proteus spp. were obtained. The isolates were obtained during a prospective observational study of community-onset severe sepsis and septic shock in adults at Skaraborg Hospital in Sweden's western region. The collected isolates were characterized using approved laboratory techniques, such as phenotypic antibiotic susceptibility testing (AST) in accordance with EUCAST guidelines and species identification by MALDI-TOF MS analysis. The species identification result matched the phenotypic method, with the exception of two isolates from Pseudomonas samples and four isolates from Proteus samples. When benchmarking the in-house pipeline and the 1928 platform for Pseudomonas spp., the genotypic predictions indicated that 97% of the isolates were resistant to at least one class of the tested antibiotics, of which 94% showed multi-drug resistance. Phenotypically, 88% of the isolates had at least one antibiotic resistance feature, of which 68% showed multi-drug resistance. The most prevalent sequence types (STs) identified were ST3285 and ST111 (9.3% each) and ST564 and ST17 (6.98% each), and both pipelines accurately predicted the number of multilocus types. The in-house pipeline reported 9820 Pseudomonas virulence genes, with PhzB1, a metabolic factor, being the most common gene. There was a significant correlation between the virulence factor gene count and the multilocus sequence typing (MLST) result (p = 0.00001). With a Simpson's Diversity Index of 0.98, the urine culture specimens showed the greatest ST diversity. Plasmids were detected in twelve samples (20.93%) in total. In general, this study provided a detailed description of the bacterial features of Pseudomonas and Proteus organisms using WGS data. This research shows that the in-house and 1928 pipelines can be used to identify sepsis-causing organisms accurately. It also showed the need for an organized and easy-to-use international pipeline to implement and analyze WGS bacterial data and to compare it with laboratory results as needed.
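As a rough illustration of the Simpson's Diversity Index mentioned in the abstract above, the following Python sketch computes one common finite-sample form, 1 - Σ n_i(n_i - 1) / (N(N - 1)). The abstract does not state which variant of the index the thesis used, so the formula choice is an assumption, and the sequence-type labels in the example are hypothetical placeholders rather than data from the study.

```python
from collections import Counter


def simpson_diversity(labels):
    """Simpson's diversity index: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)).

    Assumption: this finite-sample form is used; the thesis may have used
    another variant (for example 1 - sum(p_i ** 2)).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share the same label.
    same_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_pairs / (total * (total - 1))


# Hypothetical sequence-type assignments, for illustration only.
example_sts = ["ST111", "ST111", "ST17", "ST564", "ST3285", "ST3285"]
print(round(simpson_diversity(example_sts), 3))
```

A value close to 1 means that two randomly drawn isolates almost never share a sequence type, which is how a reported index of 0.98 would be read under this definition.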
157

Implementing ExomeDepth with a variant filter as a CNV-calling tool for germline clinical diagnostic testing

Krysén, Alice January 2022
Copy number variations (CNVs) cover approximately 4.9 - 9.5% of the human genome. CNVs are involved in both the development of disease and evolutionary adaptations. CNVs play an important part in the development and progression of multiple cardiovascular diseases. CNV calling is traditionally performed with chromosomal microarray (CMA). The advantages of next generation sequencing (NGS) over CMA include higher resolution, lower cost and higher sensitivity in detecting smaller CNVs. However, CNV calling with NGS is associated with a high number of false positives. In this study three different CNV-calling tools for clinical exome sequencing data were evaluated: CoNIFER, CONTRA and ExomeDepth. To further decrease the false positive rate and decrease the hands-on analysis time, a variant filter for ExomeDepth was developed and evaluated. However, CNV calling with clinical exome data is still challenging due to low coverage.
158

Guld och gröna skogar : En analys av värdefull ekmiljö i Stockholms län / Old but gold : Excavating hidden treasures of the oak in Stockholm County

Pakka, Camilla January 2023
Oak (Quercus spp.) is of great national importance for biodiversity, as the species are central and fundamental components of local ecosystems. Stockholm County is one of Sweden's most densely populated regions, where rapid urban densification and high rates of exploitation pose a real threat to valuable oak habitats. The purpose of the study is to investigate variables that define valuable oak habitats and, through GIS analysis, to locate their distribution in Stockholm County. Furthermore, the study examines whether there is a positive relationship between parameters associated with oak habitats, biodiversity, and the size of value areas. Through literature review, 18 parameters suitable for identifying and ranking valuable oak habitats were noted. A scoring system was developed to assess oak attributes as an indirect measure of biodiversity. The system provides a relatively accessible identification of valuable oak habitats across three scales: elements with high conservation value (Swedish: värdeelement), valuable core areas (Swedish: värdekärnor), and value areas at landscape scale (Swedish: värdetrakt). Cluster analyses based on four dispersal distances (119, 200, 477 and 656 meters) were conducted to locate valuable core areas. Moreover, a buffer analysis based on a two-kilometer dispersal distance identified 20 value areas at landscape scale in Stockholm County. The 172 valuable core areas and 20 value areas were then ranked based on the mean scores of the oak trees. Spearman's correlation analysis was performed to examine whether there was a positive correlation between the size of the valuable core areas or value areas and biodiversity; no positive correlation was found. The findings of the study suggest that high nature values are not synonymous with large valuable core areas or value areas, and that small valuable core areas and value areas can also be of great importance.
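The Spearman's correlation analysis described in the abstract above can be sketched in plain Python as a Pearson correlation computed on average ranks. This is only a minimal illustration of the general technique: the function names, the area values and the mean oak scores below are hypothetical and are not taken from the thesis, which does not describe its software.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical core-area sizes (hectares) and mean oak scores.
areas = [1.2, 3.5, 0.8, 10.0, 4.1]
scores = [7.0, 5.5, 8.0, 6.0, 6.5]
print(round(spearman_rho(areas, scores), 3))
```

Because the coefficient is rank-based, it tests for a monotonic relationship between area and score rather than a strictly linear one.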
159

Examination of pathway crosstalk and functional modules in papillary thyroid cancer dedifferentiation to anaplastic thyroid cancer

Theodorou, Maria Panagiota January 2023
Thyroid cancer comprises well-differentiated follicular and papillary types alongside less common medullary and anaplastic subtypes with poor prognoses. Some anaplastic cases result from papillary dedifferentiation, but precise molecular evidence for this process is lacking. Utilizing Metascape, CTpathway and PathwAX II, the study integrates functional modules and pathway crosstalk for dedifferentiation analysis and conducts a comprehensive two-dimensional assessment of the toolset's functionality, compatibility, and interoperability. Results suggest that transitions between the cancer subtypes involve pathways related to cellular processes, extracellular matrix interactions, and genetic alterations. Metascape enriched the findings of the crosstalk tools, providing extended lists of specific pathways, while CTpathway exhibited better sensitivity and specificity, offering more result customization options and database selection than PathwAX II. PathwAX II, with unique interactive features for network display and identifying depleted pathways, emerges as a valuable component in a comprehensive pipeline integrating these three tools. Additional validation against previous clinical studies affirms the reliability of the results, reinforcing PathwAX II's role as a key reference point in the creation of such a pipeline. The study also suggests future tool development directions, highlighting strengths and limitations across the platforms. The detailed pathway and gene analysis contributes concrete knowledge to the scientific community, serving as a hallmark for future studies.
160

Mathematical modelling simulation data and artificial intelligence for the study of tumour-macrophage interaction

Chaliha, Jaysmita Khanindra January 2023
The study explores the integration of mathematical modelling and machine learning to understand tumour-macrophage interactions in the tumour microenvironment. It details mathematical models based on biochemistry and physics for predicting tumour dynamics, highlighting the role of macrophages. Machine learning, particularly unsupervised and supervised techniques such as K-means clustering, logistic regression, and support vector machines, is implemented to analyse simulation data. The thesis's integration of K-means clustering reveals distinct tumour behaviour patterns through the classification of tumour cells based on their microenvironmental interactions. This segmentation is crucial for understanding tumour heterogeneity and its implications for treatment. Additionally, the application of logistic regression provides insights into the probability of macrophage polarization states in the tumour microenvironment. This statistical model underscores the significant factors influencing macrophage behaviour and their consequent impact on tumour progression. These analytical approaches enhance the understanding of the complex dynamics within the tumour microenvironment, contributing to more effective tumour study strategies. The study presents a comprehensive analysis of tumour growth, macrophage polarization, and their impact on cancer treatment and prognosis. Ethical considerations and future directions focus on enhancing model accuracy and integrating experimental data for improved cancer diagnosis and treatment strategies. The thesis concludes with the potential of this hybrid approach in advancing cancer biology and therapeutic approaches. / There is other digital material (e.g. film, image or audio files) or models/artifacts that belongs to the thesis and needs to be archived.
