91

Simulating Artificial Recombination for a Deep Convolutional Autoencoder

Levin, Fredrik January 2021
Population structure is an important field of study because of its role in uncovering the genetics underlying various diseases. This thesis therefore examines a recently presented deep convolutional autoencoder that has shown promising results when compared with the state-of-the-art method for quantifying genetic similarities within population structure. The main focus was to introduce data augmentation in the form of artificial diploid recombination to this autoencoder, in an attempt to increase the performance and robustness of the network structure. The training data for the network consist of arrays containing information about the single-nucleotide polymorphisms present in an individual. Each instance of augmented data was simulated by randomising cuts based on the distance between the polymorphisms, and then creating a new array by alternating between the arrays of two randomly chosen original data instances. Several networks were then trained using this data augmentation. The performance of the trained networks was compared with that of networks trained on only original data, using several metrics. Both groups of networks had similar performance for most metrics. The main difference was that networks trained on only original data had a low genotype concordance on simulated data. This indicates an underlying risk in using the original networks, which can be overcome by introducing the artificial recombination.
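The augmentation step described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `recombine`, the fixed number of cuts, and the choice to weight each cut by the base-pair gap between neighbouring SNPs are all assumptions made for illustration.

```python
import random


def recombine(parent_a, parent_b, positions, n_cuts, rng):
    """Build one synthetic genotype array by alternating between two parents.

    Hypothetical helper: `positions` holds strictly increasing SNP coordinates,
    and cut points are drawn with probability proportional to the gap between
    neighbouring SNPs (an assumed reading of "cuts based on the distance").
    """
    if not (len(parent_a) == len(parent_b) == len(positions)):
        raise ValueError("parents and positions must have equal length")
    if len(positions) < 2:
        return list(parent_a)  # nothing to cut between
    # Gap i lies between SNP i and SNP i + 1; wider gaps are cut more often.
    gaps = [positions[i + 1] - positions[i] for i in range(len(positions) - 1)]
    # Sampling is with replacement, so fewer than n_cuts distinct cuts may result.
    cuts = set(rng.choices(range(len(gaps)), weights=gaps, k=n_cuts))
    parents = (parent_a, parent_b)
    child, source = [], 0
    for i in range(len(positions)):
        child.append(parents[source][i])
        if i in cuts:
            source = 1 - source  # switch to the other parent after a cut
    return child


# Toy example: two individuals with six SNPs and one large gap in the middle.
example_child = recombine(
    [0, 0, 0, 0, 0, 0],
    [2, 2, 2, 2, 2, 2],
    [10, 20, 30, 1000, 1010, 1020],
    n_cuts=1,
    rng=random.Random(0),
)
```

With a single cut, the child starts as a copy of the first parent and ends as a copy of the second; the large gap between the third and fourth SNP is the most likely switch point.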
92

Investigation of the gene expression landscape of human skin wounds

Cheung, Yuen Ting January 2021
Wound healing is a complex physiological process. Effective wound healing restores the skin's barrier function after injury. However, because of the complex nature of wounds, the mechanisms underlying tissue repair are still poorly understood. This has hindered the development of treatments for chronic wounds, which pose a threat to both health systems and the economy. Long non-coding RNAs (lncRNAs) have been identified as important regulators of gene expression that play functional roles in many biological processes. The aim of this study was to unravel the gene regulatory network in human skin wound healing and, in particular, to identify lncRNAs that may play a functional role in skin repair. We performed RNA sequencing to profile gene expression in fibroblasts and keratinocytes isolated from matched skin and day-7 acute wounds of five healthy donors. We predicted a total of 1974 and 3444 mRNA–lncRNA correlated pairs in wound fibroblasts and wound keratinocytes, respectively. By integrating the results from gene ontology enrichment and weighted co-expression network analysis, we shortlisted lncRNAs that may play a functional role in human skin wound healing.
93

A spatial analysis of Norwegian spruce cone developmental stages

Orozco, Alina January 2020
Norway spruce (Picea abies) is an economically important export for Sweden. A number of environmental and endogenous factors affect the generation time of this species, meaning that it can take 20-25 years for a tree to mature. The long generation time creates a challenge for plant breeding programs, both in how genetic mechanisms can be studied and in how quickly trees can be produced for lumber. The characterization of gene expression patterns in the context of special tissue domains is essential to understanding the functions underlying complex biological systems and, in the case of P. abies, may prove even more crucial for determining the activation of genes at specific reproductive growth points. Several techniques are available for the analysis of spatial expression profiles; however, the high-throughput nature of Spatial Transcriptomics, coupled with the morphological information it provides, creates new opportunities for exploratory analysis. Spatial Transcriptomics offers a distinct approach to answering fundamental questions about the genetic mechanisms that regulate reproductive phase change and cone-setting in conifers. This study focuses on spatial gene expression analysis and the integration of de novo transcriptome assembly contigs to confirm the spatial context of putatively discovered genes such as DAL1, DAL2, DAL3, and DAL10 from previous studies, and to potentially localize transcripts that could not previously be identified because complete transcripts could not be obtained. The aim is to create a workflow to identify genes that contribute to the growth patterns in the naturally occurring acrocona mutant, which could prove useful for improving tree breeding programs.
94

Enterprise Search for Pharmacometric Documents : A Feature and Performance Evaluation

Edenståhl, Selma January 2020
Information retrieval within a company can be referred to as enterprise search. With enterprise search, employees can find the information they need in the company's internal data. A business that can take advantage of the knowledge within the organization can save time and effort, and that knowledge can be a source of innovation and development within the company. In this project, two open source search engines, Recoll and Apache Solr, are selected, set up, and evaluated based on requirements and needs at the pharmacometric consulting company Pharmetheus AB. A requirement analysis is performed to collect system requirements at the company. Through a literature survey, two candidate search engines are selected. Lastly, a Proof of Concept is performed to demonstrate the feasibility of the search engines at the company. The search tools are evaluated on criteria including indexing performance, search functionality and configurability. This thesis presents assessment questions to be used when evaluating a search tool. It is shown that the indexing time for both Recoll and Apache Solr appears to scale linearly for fewer than one hundred thousand PDF documents. The benefit of an index is demonstrated by search times for both search engines that greatly outperform the Linux command-line tools grep and find. It is also explained how the strict folder structure and naming conventions at the company can be used in Recoll to index only specific documents and sub-parts of a file share. Furthermore, I demonstrate how the Recoll web GUI can be modified to include functionality for filtering on document type. The results show that Recoll meets most of the company's system requirements and could therefore serve as an enterprise search engine at the company. However, the search engine lacks support for authentication, which has to be further investigated and implemented before the system can be put into production.
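The advantage of an index over a linear scan, as reported in this abstract, can be illustrated with a toy inverted index. This is a sketch only: it is not how Recoll or Apache Solr are implemented, and the function names and the whitespace tokenisation are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lower-cased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_index(index, term):
    """Look the term up directly; the cost does not grow with corpus size."""
    return index.get(term.lower(), set())


def search_scan(docs, term):
    """Re-read every document for each query, roughly as a grep-style scan would."""
    term = term.lower()
    return {doc_id for doc_id, text in docs.items() if term in text.lower().split()}


# Hypothetical two-document corpus.
docs = {
    "report_a.pdf": "clearance model for paclitaxel",
    "report_b.pdf": "Model diagnostics",
}
index = build_index(docs)
hits = search_index(index, "model")
```

Both search functions return the same documents. The index pays its cost once, at build time, whereas the scan repeats the full read for every query, which is the trade-off the indexing and search-time measurements in the thesis reflect.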
95

Identifying Mitochondrial Genomes in Draft Whole-Genome Shotgun Assemblies of Six Gymnosperm Species / Identifiering av mitokondriers arvsmassa från preliminära versioner av arvsmassan för sex gymnospermer

Eldfjell, Yrin January 2018
Sequencing efforts for gymnosperm genomes typically focus on nuclear and chloroplast DNA, with only three complete mitochondrial genomes published as of 2017. The availability of additional mitochondrial genomes would aid biological and evolutionary understanding of gymnosperms. Identifying mtDNA from existing whole genome sequencing (WGS) data (i.e. contigs) removes the need for additional experimental work, but previous classification methods show limitations in sensitivity or accuracy, particularly in difficult cases. In this thesis I present a classification pipeline based on (1) k-mer probability scoring and (2) SVM classification applied to the available contigs. Using this pipeline, the mitochondrial genomes of six gymnosperm species were obtained: Abies sibirica, Gnetum gnemon, Juniperus communis, Picea abies, Pinus sylvestris and Taxus baccata. Cross-validation experiments showed a satisfactory, and for some species excellent, degree of accuracy.
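As a rough illustration of the first pipeline stage, the sketch below scores a contig by the average log-odds of its k-mers under two frequency models. The abstract does not specify the scoring, so this reading of "k-mer probability scoring" is an assumption, as are the helper names, the pseudocount, and the A/C/G/T-only alphabet. The SVM stage is not shown.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_log_probs(training_seqs, k, pseudocount=1.0):
    """Return a function giving the smoothed log-probability of a k-mer.

    Assumes an A/C/G/T alphabet, so there are 4**k possible k-mers.
    """
    counts = Counter()
    for seq in training_seqs:
        counts.update(kmer_counts(seq, k))
    total = sum(counts.values()) + pseudocount * 4 ** k

    def log_prob(kmer):
        return math.log((counts[kmer] + pseudocount) / total)

    return log_prob


def log_odds_score(contig, k, lp_mito, lp_nuclear):
    """Mean per-k-mer log-odds; positive values favour the mitochondrial model."""
    kmers = [contig[i:i + k] for i in range(len(contig) - k + 1)]
    if not kmers:
        return 0.0  # contig shorter than k carries no evidence
    return sum(lp_mito(m) - lp_nuclear(m) for m in kmers) / len(kmers)


# Toy training data standing in for known mitochondrial and nuclear sequence.
lp_mito = kmer_log_probs(["ATATATATATAT"], k=3)
lp_nuclear = kmer_log_probs(["GCGCGCGCGCGC"], k=3)
score = log_odds_score("ATATAT", 3, lp_mito, lp_nuclear)
```

In a two-stage design like the one the abstract outlines, a score of this kind could serve as one input feature for the downstream classifier rather than as a final decision.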
96

Deep morphological quantification and clustering of brain cancer cells using phase-contrast imaging

Engberg, Jonas January 2021
Glioblastoma Multiforme (GBM) is a very aggressive brain tumour. Previous studies have suggested that the morphological distribution of single GBM cells may hold information about the severity of the disease. This study investigates whether automated morphological quantification and clustering of GBM cells is feasible and what it reveals. To this end, phase-contrast images from 10 different GBM cell cultures were analyzed. To test the hypothesis that morphological differences exist between the cell cultures, images of single GBM cells were created from an image of the whole well using CellProfiler and Python. The single-cell images were passed through several feature extraction models to identify the model showing the most promise for this dataset. The features were then clustered and quantified to see whether any differentiation exists between the cell cultures. The results suggest that morphological feature differences exist between GBM cell cultures when automated models are used. The siamese network managed to construct clusters of cells with very similar morphology. I conclude that the 10 cell cultures seem to have cells with morphological differences. This highlights the importance of future studies to find out what these morphological differences imply for patients' survival and choice of treatment.
97

Towards Individualized Drug Dosage : General Methods and Case Studies

Fransson, Martin January 2007
Progress in individualized drug treatment is of increasing importance, promising to avoid much human suffering and to reduce medical treatment costs for society. The strategy is to maximize the therapeutic effects and minimize the negative side effects of a drug on an individual or group basis. To reach this goal, interactions between the human body and different drugs must be further clarified, for instance by using mathematical models. Whether clinical studies or laboratory experiments are used as primary sources of information greatly influences the possibilities of obtaining data. This must be considered both prior to and during model development, and different strategies must be used. The character of the data may also restrict the level of complexity of the models, thus limiting their usage as tools for individualized treatment. In this thesis work, two case studies have been made, each with the aim of developing a model for a specific human-drug interaction. The first case study concerns treatment of inflammatory bowel disease with thiopurines, whereas the second is about treatment of ovarian cancer with paclitaxel. Although both case studies make use of similar amounts of experimental data, model development depends considerably on prior knowledge about the systems, the character of the data and the choice of modelling tools. All these factors are presented for each of the case studies along with current results. Further, a system for classifying different but related models is proposed, with the intention that an increased understanding will contribute to advancement in individualized drug dosage. / Report code: LiU-Tek-Lic-2007:41.
98

Predicting morphological effect of compounds on COVID-19 infected cells

Öhrner, Viktor January 2023
The cost of developing new drugs is high, and the aim of computer-assisted drug discovery is to reduce that cost, either through virtual screening or by generating novel compounds. Systems biology is one approach to drug discovery in which the response of a biological system, rather than the drug-target interaction, is the subject of study. One way to observe a biological system is through microscopy images taken of cells perturbed with compounds. Image analysis software extracts information called morphological profiles from the images, which can be used for data-hungry models. One way artificial intelligence has been applied to drug discovery is through generative models that can generate new compounds. One such approach is reinforcement learning, which employs a critic to guide the generation of compounds towards desirable behaviors. In this study, different machine learning models were tested to see whether they could predict the morphological response of COVID-19 infected cells to compounds from the compounds' structure. No model showed promising results. The poor performance is attributed to the dataset. There is a lot of variance in the dataset, meaning that the response to the same compound varies. The compounds in the dataset also differ greatly from one another, meaning that any representation a model learns does not transfer to other compounds. The dataset was also imbalanced, with more inactive compounds.
99

Developing a reproducible bioinformatics workflow for canine inherited retinal disease

Martin, Melina Toni Marie January 2023
Inherited Retinal Degenerations (IRDs) are a heterogeneous group of diseases that lead to vision impairment and are found both in humans and in dogs. About 1 in 1,380 humans is estimated to suffer from an autosomal recessive IRD, which would be 5.5 million people worldwide, and many more are estimated to be unaffected carriers. This makes autosomal recessive IRDs likely the most common group of Mendelian diseases in humans. Today, about 300 genetic mutations have been linked to retinal diseases in humans. In dogs only 32 genes have been identified, yet numerous eye conditions have been described for which the genetic cause is still unknown. This suggests that there are many more genetic causes to discover in the dog genome. Additionally, the dog serves well as a model organism for investigating IRDs, as it shares morphological and genetic similarities with humans. For these reasons, proper software, a high-quality canine reference genome, and smart implementation of bioinformatic tools and methods are a big advantage: they increase the chances of finding new causative genetic variants and subsequently enable faster identification of ways to prevent the disease, or at least to alleviate its symptoms through early diagnosis. In this project, a pre-existing pipeline consisting of Bash scripts was improved stepwise with the goal of increasing its efficiency. After checking in a first step whether previous results could still be reproduced with the old pipeline, the software was updated to more recent versions in a second step. A main change was switching the Burrows-Wheeler Aligner (BWA) mapping command from bwa mem to bwa-mem2 mem and updating the deprecated Genome Analysis Toolkit (GATK) 3.7 to version 4.3 or 4.4. Thirdly, the scripts were adapted from the older canine reference genome CanFam3.1 to CanFam4.
In a fourth step, to automate the pipeline and shorten its running time, the pipeline steps were implemented in the workflow management system Nextflow. This step was also partly aimed at bringing the pipeline into line with the FAIR principles. All steps were tested on the same test data set, a Labrador retriever family trio in which a previous study had detected one genetic cause of a canine form of the IRD Stargardt disease, namely an insertion in the ABCA4 gene. Lastly, the workflow was also tested on a second data set, from two sibling pairs of Chinese Crested Dogs (CCR) with a novel IRD of unknown genetic origin. Changing the mapping tool gave similar results. Introducing the new reference genome revealed a drop in average coverage of one read on average when using CanFam4, while other results were similar. Using the new reference genome increased the number of unknown variants compared with the findings for CanFam3.1. However, the known causative variant for the canine form of Stargardt disease, an insertion in the ABCA4 gene, was found in all cases. The run with Nextflow produced results identical to those obtained when the respective steps were run with Bash scripts, but it reduced the running time. Running the workflow on the new data set (CCR), followed by annotation and filtering, indicated new candidate variants that could be further investigated as potential causes of this IRD of currently unknown origin.
100

Predicting and classifying atrial fibrillation from ECG recordings using machine learning

Bogstedt, Carl January 2023
Atrial fibrillation is one of the most common types of heart arrhythmia and can cause irregular, weak and fast atrial contractions of up to 600 beats per minute. Atrial fibrillation becomes more prevalent with age and is associated with increased risks of ischemia, as blood clots can form because of the weak contractions. During prolonged periods of atrial fibrillation, the atria can undergo a process called atrial remodelling. This causes electrophysiological and structural changes to the atria, such as increased atrial size and changes to calcium ion densities. These changes themselves promote the initiation and propagation of atrial fibrillation, which makes early detection crucial. Fortunately, atrial fibrillation can be detected on an electrocardiogram. Electrocardiograms measure the electrical activity of the heart during its cardiac cycle. This includes the initiation of the action potential, the depolarization of the atria and ventricles, and their repolarization. On the electrocardiogram recording, these are seen as peaks and valleys, each of which can be traced back to one of these events. This means that during atrial fibrillation, the weak, irregular and fast atrial contractions can all be detected and measured. The aim of this project was to develop a machine learning model that could predict the onset of atrial fibrillation and classify ongoing atrial fibrillation. This was achieved by training one multiclass classification machine learning model using XGBoost, and three binary classification machine learning models using ROSETTA, on electrocardiogram recordings of people with and without atrial fibrillation. XGBoost is a tree boosting system that uses tree-like structures to classify data, while ROSETTA is a rule-based classification model that creates rules in an IF-THEN format to make decisions.
The recordings were labelled according to three different classes: no atrial fibrillation, atrial fibrillation, or preceding atrial fibrillation. The XGBoost model had a prediction accuracy of 99.3%, outperforming the three ROSETTA models and the other atrial fibrillation classification and prediction models found. The ROSETTA models had high accuracies on the learning set; however, their predictions were subpar, indicating faulty settings for this type of data. The results of this project indicate that the models created can be used to accurately predict the onset of atrial fibrillation and classify ongoing atrial fibrillation, serving as a tool for early detection and verification of diagnosis.
