  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
121

Text Mining Methods for Biomedical Data Analysis / Text Mining Metoder för Biomedicinsk Data Analys

Jabeen, Rakhshanda January 2021
Topic modeling of biological data has become very popular among researchers in recent times. However, analysing countless research papers and gathering consensus regarding biomedicine is a near-impossible task for any researcher due to the complexity and quantity of material that is published. This thesis focuses on two objectives that can help researchers in this domain, based on data related to five major DNA repair pathways. The first objective is to propose an unsupervised approach to examine the hidden structures and analyse research trends in temporal biomedical text data. The second objective is to find DNA repair markers involved in immune defense and retrieve potential PPIs, GIs, and disease-gene associations reported in the literature. We used latent Dirichlet allocation (LDA) to discover hidden themes and semantically coherent topics in the text. We clustered the documents based on LDA topic models to analyse the research trend and used the Mann-Kendall test to understand the trends of the topics. A hybridization of text mining methods with a classical co-occurrence statistical approach and association rule mining was used to discover potential PPIs, GIs, and disease-gene associations in the text. The results for PPIs and GIs were then evaluated against an external biological database of PPIs.
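The Mann-Kendall test mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the function name `mann_kendall` and the yearly topic shares in the usage note are hypothetical, and the sketch uses the standard normal approximation with tie correction and continuity correction.

```python
import math
from itertools import combinations


def mann_kendall(values):
    """Two-sided Mann-Kendall trend test for a series ordered in time.

    Returns (S, Z, p): the Mann-Kendall statistic, its normal-approximation
    score, and the two-sided p-value. Hypothetical helper for illustration.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")

    # S sums the sign of (later - earlier) over all ordered pairs.
    s = 0
    for earlier, later in combinations(values, 2):
        s += (later > earlier) - (later < earlier)

    # Variance of S under the null hypothesis, corrected for tied values.
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in tie_counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    if var_s == 0:  # every value identical: no trend can be assessed
        return s, 0.0, 1.0

    # Continuity-corrected Z score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, z, p


# Made-up yearly share of documents assigned to one topic.
topic_share = [0.02, 0.03, 0.03, 0.05, 0.06, 0.08, 0.09, 0.11]
s_stat, z_score, p_value = mann_kendall(topic_share)
```

A positive S with a small p-value would suggest that a topic's share is rising over time; a negative S would suggest a decline.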
122

Diagnosing intraventricular hemorrhage from brain ultrasound images using machine learning

Dalla Santa, Chiara January 2023
No description available.
123

CRISPR-Drawr, a tool to design mutagenic primer

Torbjörn, Larsson January 2023
Short open reading frames (sORFs) are codon sequences with a start and stop codon within at most 100 codons. Cells produce many transcripts from them, and some sORFs have been found to be functional. sORFs have been associated with embryogenesis, myogenesis, immunity and various diseases including cancers. Cell culture screening is a common method to study function in sORFs. By inserting mutations in known sORF locations, one can affect their translation by removing start codons, inserting premature stop codons, or removing native stop codons. A new toolset for this is CRISPR technology, where single guide RNA (gRNA) can be used to make more precise genome edits. Unfortunately, such design is nontrivial and yields many variants for testing. It results in a back-and-forth testing process involving different available design tools. In this project, a comprehensive way was developed to view and iterate over the many test combinations. This is intended to ease the process and decrease the likelihood of errors. The developed solution is a tool that integrates the currently best design tools. It also introduces a new quality summary score that can evaluate the estimated outcomes of the various designed guide variants. The tool was tested, and it was found that the score simplifies and amplifies the previously used scoring methods. The pipeline is simple to install and use, integrates the currently most actively developed tools, and an installation is as future-proof as can be made in a rapidly evolving field.
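The sORF definition and the three mutation strategies in this abstract can be illustrated with a small Python sketch. It is not part of CRISPR-Drawr: the helper names `find_sorfs` and `substitute` and the toy sequence are hypothetical, only the forward strand is scanned, and the 100-codon limit is assumed to include both the start and the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(dna, max_codons=100):
    """Return (start, end, n_codons) for each forward-strand ORF that begins
    with ATG and reaches an in-frame stop codon within max_codons codons.

    Assumption: the codon count includes the start and the stop codon.
    """
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon, staying in frame with the ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            n_codons = (pos - start) // 3 + 1
            if n_codons > max_codons:
                break
            if dna[pos:pos + 3] in STOP_CODONS:
                hits.append((start, pos + 3, n_codons))
                break
    return hits


def substitute(dna, index, base):
    """Return dna with the nucleotide at 0-based index replaced by base."""
    return dna[:index] + base + dna[index + 1:]


toy = "CCATGGCTTGGAAATAACC"                      # ATG GCT TGG AAA TAA
native = find_sorfs(toy)                         # one sORF of five codons
start_loss = find_sorfs(substitute(toy, 3, "C"))      # ATG -> ACG
premature_stop = find_sorfs(substitute(toy, 9, "A"))  # TGG -> TAG
```

In this toy case a single-base change either abolishes the start codon or truncates the reading frame from five codons to three, which is the kind of edit a designed guide would aim to introduce.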
124

Characterization of the evolution of satellite DNA across Passeriformes

Martins Borges, Inês January 2022
Satellite DNA (satDNA) is among the fastest evolving elements in the genome and is highly abundant in some eukaryotic genomes. Its highly repetitive nature means it is challenging to assemble, and thus underrepresented in most assemblies and often understudied as a result. Birds are an ideal model organism for the study of satDNA and its evolution, since the large number of available sequenced genomes of this clade allows for dense sampling across various evolutionary timescales, and the low number of satDNA families within their satellitomes facilitates their study and comparison between species. Here, we characterize satDNA and its evolution across Passeriformes, an avian clade containing two-thirds of all bird species and spanning ~50 million years of evolution. With this goal, we use both short-read data and long-read assemblies of species representative of over 30 passerine families in this clade to shed light on the evolution of its satellitome. We focus on examining the phylogenetic relationships between satellites common to most species as well as characterizing satellite array structure and location in genome assemblies. We also analyse satellite abundance in each genome, focusing on differences in the satellite content between male and female individuals to look for satellites present in the female-specific W sex chromosome and the germline-restricted chromosome. We found seven satDNA families shared by a quarter of the species, which were likely present in an ancestral species shared by most, if not all, the species of Passeriformes. We observed that satDNA evolution is complex and does not follow species phylogeny, and that satellite arrays generally have a simple head-to-tail conformation, with evidence in four of the sampled species of satDNA arrays with higher-order repeats. We also found two satDNA families with fairly consistent monomer length and conserved regions that we hypothesise might be functional.
125

Pipeline for Next Generation Sequencing data of phage displayed libraries to support affinity ligand discovery

Schleimann-Jensen, Ella January 2022
Affinity ligands are important molecules used in affinity chromatography for purification of significant substances from complex mixtures. Finding affinity ligands specific to important target molecules can be a challenging process. Cytiva uses the powerful phage display technique to find new promising affinity ligands. The phage display technique is a method run in several enrichment cycles. When developing new affinity ligands, a protein scaffold library with a diversity of up to 10^10 to 10^11 different protein scaffold variants is run through the enrichment cycles. The result from the phage display rounds is screened for target molecule binding followed by sequencing, usually with one of the conventional screening methods ELISA or Biacore followed by Sanger sequencing. However, the throughput of these analyses is unfortunately very low, often with only a few hundred screened clones. Therefore, Next Generation Sequencing (NGS) has become an increasingly popular screening method for phage display libraries, generating millions of sequences from each phage display round. This creates a need for a robust data analysis pipeline to interpret the large amounts of data. In this project, a pipeline for analysis of NGS data of phage displayed libraries has been developed at Cytiva. Cytiva uses NGS as one of their screening methods of phage displayed protein libraries because of the high throughput compared to the conventional screening methods. The purpose is to find new affinity ligands for purification of essential substances used in drugs. The pipeline has been created using the object-oriented programming language R and consists of several analyses covering the most important steps needed to find promising results in the NGS data. With the developed pipeline the user can analyze the data at both the DNA and protein sequence level, obtain a per-position residue breakdown, and filter the data based on specific amino acids and positions.
This gives a robust and thorough analysis which can lead to promising results that can be used in the development of novel affinity ligands for future purification products.
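The per-position residue breakdown and the filtering by amino acid and position can be sketched as follows. The pipeline itself was written in R; this is a Python illustration only, with hypothetical function names and made-up four-residue sequences, and it assumes the translated reads have already been trimmed to equal length.

```python
from collections import Counter


def residue_breakdown(sequences):
    """Per-position amino-acid frequencies for equal-length protein sequences.

    Returns one dict per position mapping residue -> fraction of sequences.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    total = len(sequences)
    breakdown = []
    for column in zip(*sequences):  # one column per sequence position
        counts = Counter(column)
        breakdown.append({aa: n / total for aa, n in counts.items()})
    return breakdown


def filter_by_residue(sequences, position, residue):
    """Keep only sequences carrying `residue` at 0-based `position`."""
    return [seq for seq in sequences if seq[position] == residue]


# Made-up translated reads from one selection round.
reads = ["AKDE", "AKDQ", "ARDE", "AKNE"]
profile = residue_breakdown(reads)
with_e_at_last = filter_by_residue(reads, 3, "E")
```

Comparing such profiles between enrichment rounds would show which residues become over-represented at each randomized position.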
126

Enhancing carbon fixation in Rubisco through generative modelling / Mot en förbättring av kolfixering av Rubisco genom generativ AI

Shute, Ellen January 2024
Carbon capture, the removal of carbon dioxide (CO2) from the atmosphere, has gained attention as a method to mitigate the effects of global warming. Plants and phototrophic microorganisms have the inherent ability to capture carbon through the fixation of CO2 to produce biomass. However, native carbon fixing pathways are limited by key enzymes with low catalytic activity, resulting in low energy efficiency. Rubisco is one such key enzyme, notorious for its poor performance. Past research has been unsuccessful at enhancing carbon fixation in Rubisco through conventional methods.
Generative modelling has emerged as an innovative approach to enzyme engineering, taking advantage of different neural network architectures to propose novel variants with desired characteristics. Here, a variational autoencoder (VAE) trained on the Rubisco sequence space was applied to the challenge of Rubisco engineering. Two models were trained and, using the dimensionality reduction property of VAEs, the fitness landscape of Rubisco was explored. Sequences were labelled with catalytically relevant data and a regression model was built with the aim of predicting those sequences with enhanced catalytic activity. Novel Rubisco sequences were generated following systematic interrogation of the low-dimensional space. The use of generative modelling here provides a fresh perspective on Rubisco engineering.
127

Data Deconvolution for Drug Prediction

Menacher, Lisa Maria January 2024
Treating cancer is difficult as the disease is complex and drug responses often depend on the patient's characteristics. Precision medicine aims to solve this by selecting individualized treatments. Since this involves the analysis of large datasets, machine learning can be used to make the drug selection process more efficient. Traditionally, such models utilize bulk gene expression data. However, this potentially masks information from small cell populations and fails to address tumor heterogeneity. Therefore, this thesis applies data deconvolution methods to bulk gene expression data and estimates the corresponding cell type-specific gene expression profiles. This "increases" the resolution of the input data for the drug response prediction. A hold-out dataset, LODOCV and LOCOCV were used to evaluate this approach. Furthermore, all results are compared against a baseline model trained on bulk data. Overall, the accuracy of the cell type-specific model did not improve on that of the bulk model. The model also prioritizes information from bulk samples, which makes the additional data unnecessary. The robustness of the cell type-specific model is slightly lower than that of the bulk model. Note that these outcomes are not necessarily due to a flaw in the underlying concept, but may be connected to poor deconvolution results, as the same reference matrix was used for the deconvolution of all bulk samples regardless of the cancer type or disease.
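The grouped cross-validation schemes named in this abstract can be sketched in Python. The abstract does not expand the acronyms; the sketch assumes LODOCV and LOCOCV denote leave-one-drug-out and leave-one-cell-line-out cross-validation. The function name, record fields and response values are hypothetical.

```python
def leave_one_group_out(records, key):
    """Yield (held_out, train, test) splits, holding out one value of `key`
    at a time so that the test group never appears in the training data."""
    groups = sorted({rec[key] for rec in records})
    for held_out in groups:
        train = [rec for rec in records if rec[key] != held_out]
        test = [rec for rec in records if rec[key] == held_out]
        yield held_out, train, test


# Made-up drug response records (values are placeholders, not real data).
records = [
    {"drug": "drugA", "cell_line": "line1", "response": 0.8},
    {"drug": "drugA", "cell_line": "line2", "response": 0.4},
    {"drug": "drugB", "cell_line": "line1", "response": 0.6},
    {"drug": "drugB", "cell_line": "line2", "response": 0.1},
    {"drug": "drugC", "cell_line": "line1", "response": 0.9},
]

# Assumed meaning of LODOCV: each fold tests on a drug unseen in training.
drug_folds = list(leave_one_group_out(records, "drug"))
# Assumed meaning of LOCOCV: each fold tests on an unseen cell line.
cell_folds = list(leave_one_group_out(records, "cell_line"))
```

Splitting by group rather than at random measures how well a model generalizes to entities it has never seen, which is a stricter test than a random hold-out set.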
128

2D Modelling of Phytoplankton Dynamics in Freshwater Lakes

Harlin, Hugo January 2019
Phytoplankton are single-celled organisms capable of photosynthesis, and are present in all the major oceans and lakes in the world. Phytoplankton account for 50% of the total primary production on Earth, and are the dominant primary producers in most aquatic ecosystems. This thesis is based on the 1D deterministic model by Jäger et al. (2010), which models phytoplankton dynamics in freshwater lakes, where phytoplankton growth is limited by the availability of light and phosphorus. The original model is here extended to two dimensions, adding a horizontal dimension to the vertical one, in order to simulate phytoplankton dynamics under varying lake bottom topographies. The model was solved numerically using a grid transform and a finite volume method in MATLAB. Using the same parameter settings as the 1D case studied by Jäger et al. (2010), an initial study of plankton dynamics was done by varying the horizontal and vertical diffusion coefficients independently.
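The finite volume idea can be illustrated with a deliberately simplified Python sketch. It is not the thesis model, which is 2D, uses a grid transform, includes light- and phosphorus-limited growth, and was implemented in MATLAB. The sketch assumes a single vertical water column with uniform cells, pure diffusion, closed (zero-flux) boundaries and explicit time stepping; all parameter values are made up.

```python
def diffuse_step(conc, diffusivity, dz, dt):
    """One explicit finite-volume step of dc/dt = d/dz (D dc/dz).

    conc holds cell-averaged concentrations on a uniform grid. Stability of
    this explicit scheme requires D * dt / dz**2 <= 0.5.
    """
    n = len(conc)
    # Diffusive flux across each interior face between cell i and cell i + 1.
    interior = [-diffusivity * (conc[i + 1] - conc[i]) / dz for i in range(n - 1)]
    # Closed boundaries: zero flux through the first and last face.
    faces = [0.0] + interior + [0.0]
    # Each cell changes by the net flux through its two faces.
    return [conc[i] - dt / dz * (faces[i + 1] - faces[i]) for i in range(n)]


# Made-up setup: all plankton initially in the top cell of a 5-cell column.
column = [1.0, 0.0, 0.0, 0.0, 0.0]
D, dz, dt = 0.1, 0.5, 0.5          # D * dt / dz**2 = 0.2, within the limit
for _ in range(200):
    column = diffuse_step(column, D, dz, dt)
total = sum(column)
```

Because every interior flux leaves one cell and enters its neighbour, the scheme conserves total mass by construction, which is the main reason finite volume methods suit transport problems of this kind.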
129

Validation of a new software for detection of resistance associated substitutions in Hepatitis C-virus

Vigetun Haughey, Caitlin January 2019
Hepatitis C infection is a global disease that causes an estimated 399,000 deaths per year. Treatment has improved dramatically in recent years through the development of direct-acting antivirals that target specific regions of the Hepatitis C virus (HCV). Unfortunately, the virus can have a preexisting resistance or become resistant to these drugs through mutations in the genes that code for the target proteins. These mutations are called resistance-associated substitutions (RASs). Since RASs can cause treatment failure for patients, resistance detection is performed in clinical practice to select the ideal regimen. Currently, RASs are detected using Sanger sequencing and a partly manual workflow that can detect a RAS if it is present in 15-20% of the viruses in a patient's blood. A new method with the capacity to detect lower ratios of RASs in HCV sequences was developed, which utilizes Pacific Biosciences' (PacBio's) sequencing and a bioinformatics analysis software called CLAMP. To validate this new approach, 123 HCV patient samples were sequenced with both methods and then analyzed. The RASs detected with the new method were congruent with those found with the Sanger-based workflow. The new approach was also shown to correctly genotype the virus samples, identify any co-existing mutations on the same sequences, and detect any mixed-genotype infections in the samples. The new procedure was found to be a valid replacement for the Sanger-based workflow, with the possibility to perform additional analyses and automated, time-efficient RAS detection.
130

Characterising copy number polymorphisms using next generation sequencing data

Li, Zhiwei January 2019
We developed a pipeline to identify copy number polymorphisms (CNPs) in the Northern Swedish population using whole genome sequencing (WGS) data. Two different methodologies were applied to discover CNPs in more than 1,000 individuals. We also studied the association of the identified CNPs with the expression levels of 438 plasma proteins collected in the same population. The identified CNPs were summarized and filtered into a population copy number matrix for 1,021 individuals at 243,987 non-overlapping CNP loci. For the 872 individuals with both WGS and plasma protein biomarker data, we conducted linear regression analyses with age and sex as covariates. From the analyses, we detected 382 CNP loci, clustered in 30 collapsed copy number variable regions (CNVRs), that were significantly associated with the levels of 17 plasma protein biomarkers (p < 4.68 × 10^-10).
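The significance threshold in this abstract is numerically consistent with a Bonferroni correction over every locus-protein pair. The abstract does not name the correction or the family-wise error rate, so both are assumptions in the short Python check below; only the two counts come from the text.

```python
n_loci = 243_987     # non-overlapping CNP loci (from the abstract)
n_proteins = 438     # plasma proteins (from the abstract)
alpha = 0.05         # assumed family-wise error rate; not stated in the abstract

n_tests = n_loci * n_proteins      # one regression per locus-protein pair
threshold = alpha / n_tests        # Bonferroni-style per-test threshold
print(f"{n_tests} tests, threshold = {threshold:.2e}")
# prints: 106866306 tests, threshold = 4.68e-10
```

The match suggests, without proving, that each of the roughly 107 million regressions was held to a per-test level of 0.05 divided by the number of tests.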
