Global ETD Search

31	Siamese Neural Networks for Regression: Similarity-Based Pairing and Uncertainty Quantification Zhang, Yumeng January 2022 Here we present a similarity-based pairing method for generating compound pairs to train a Siamese Neural Network. In comparison with the conventional exhaustive pairing of N^2/2 pairs (N being the size of the training set), this method results in N-1 pairs, significantly reducing the training time. It consistently exhibits better prediction performance on the three physicochemical property datasets, using a multilayer perceptron with the ECFP4 fingerprint. We further incorporate into the Siamese Neural Network the pre-trained Chemformer, which extracts task-specific chemical features from the input SMILES strings. With n-shot learning, we propose a means to measure the prediction uncertainty. Our results demonstrate that higher accuracy is indeed associated with lower prediction uncertainty. In addition, we discuss implications of the similarity principle in machine learning. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
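The pair counts mentioned in this abstract can be illustrated with a small sketch. The abstract does not spell out the pairing rule, so the nearest-predecessor rule below is an assumption chosen only because it yields exactly N-1 pairs; the function names and the toy fingerprints (sets of on-bit indices standing in for ECFP4 output) are hypothetical.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def exhaustive_pairs(n: int) -> list:
    """All unordered pairs of n items: n * (n - 1) / 2, i.e. roughly N^2 / 2."""
    return list(combinations(range(n), 2))


def similarity_pairs(fingerprints: list) -> list:
    """Pair every compound after the first with its most similar predecessor.

    ASSUMPTION: this rule is an illustration, not the thesis's method.
    It produces exactly N - 1 pairs for N compounds.
    """
    pairs = []
    for i in range(1, len(fingerprints)):
        # Index of the earlier compound with the highest Tanimoto similarity.
        best = max(range(i), key=lambda j: tanimoto(fingerprints[i], fingerprints[j]))
        pairs.append((best, i))
    return pairs


# Toy fingerprints (hypothetical on-bit sets, not real ECFP4 output).
fps = [frozenset(s) for s in ({1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8, 10}, {1, 2, 3, 4})]
sim = similarity_pairs(fps)
print(len(exhaustive_pairs(len(fps))), "exhaustive pairs vs", len(sim), "similarity-based pairs")
```

For five compounds the exhaustive scheme gives ten pairs while the similarity-based sketch gives four, which is the kind of reduction the abstract credits for the shorter training time.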
32	Mathematical modelling simulation data and artificial intelligence for the study of tumour-macrophage interaction Chaliha, Jaysmita Khanindra January 2023 The study explores the integration of mathematical modelling and machine learning to understand tumour-macrophage interactions in the tumour microenvironment. It details mathematical models based on biochemistry and physics for predicting tumour dynamics, highlighting the role of macrophages. Machine learning techniques, both unsupervised and supervised, such as K-means clustering, logistic regression, and support vector machines, are applied to analyse simulation data. The thesis's use of K-means clustering reveals distinct tumour behaviour patterns through the classification of tumour cells based on their microenvironmental interactions. This segmentation is crucial for understanding tumour heterogeneity and its implications for treatment. Additionally, the application of logistic regression provides insights into the probability of macrophage polarization states in the tumour microenvironment. This statistical model highlights the significant factors influencing macrophage behaviour and their consequent impact on tumour progression. These analytical approaches enhance the understanding of the complex dynamics within the tumour microenvironment, contributing to more effective tumour study strategies. The study presents a comprehensive analysis of tumour growth, macrophage polarization, and their impact on cancer treatment and prognosis. Ethical considerations and future directions focus on enhancing model accuracy and integrating experimental data for improved cancer diagnosis and treatment strategies. The thesis concludes with the potential of this hybrid approach in advancing cancer biology and therapeutic approaches. / There is other digital material (e.g. film, image or audio files) or models/artifacts that belong to the thesis and need to be archived. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
33	Little big data - extending plastid genome databases using marine planktonic metagenomes Huber, Thomas M. January 2022 No description available. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
34	Exploring the performance of Conformal Prediction on Chemical Properties and Its Influencing Factors Chen, Yuhang January 2024 Machine learning has gained much attention and has been extended to the field of drug discovery. However, due to the uncertainties of the dataset, predictions should be quantitatively analyzed. Conformal prediction is a powerful method for quantifying these uncertainties: for a predefined confidence level, it generates a corresponding interval within which the true target is anticipated to fall. This paper aims to explore how different chemical representations of SMILES structures for training (chemical descriptors, Morgan fingerprints), machine learning algorithms (k-nearest neighbor, support vector machine, random forest, extreme gradient boosting, and artificial neural network), and different normalization methods (k-nearest neighbor, Mondrian regression) influence the conformal prediction results. We find that Morgan fingerprints outperform chemical descriptors, and that Mondrian regression outperforms k-nearest neighbor for one or several values of coverage, and the mean, median, and standard deviation of the output interval. None of the investigated machine learning methods substantially outperforms the others. The conformal predictive system, an alternative form of conformal prediction, was also investigated to explore its usefulness in drug discovery. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
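As a rough illustration of how a conformal interval is obtained for a chosen confidence level, here is a minimal split-conformal regression sketch. It is not the setup of the thesis: it uses plain absolute residuals without the k-nearest-neighbor or Mondrian normalization the abstract mentions, and the function names and calibration numbers are hypothetical.

```python
import math


def conformal_halfwidth(cal_true, cal_pred, alpha):
    """Half-width of a split-conformal interval at confidence level 1 - alpha.

    Uses absolute residuals on a held-out calibration set as nonconformity scores.
    """
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank of the score to use
    if rank > n:
        return math.inf  # too few calibration points for this confidence level
    return scores[rank - 1]


def predict_interval(point_prediction, halfwidth):
    """Symmetric interval around a point prediction."""
    return point_prediction - halfwidth, point_prediction + halfwidth


# Hypothetical calibration targets and model predictions (illustrative only).
cal_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
cal_pred = [1.1, 1.8, 3.3, 3.6, 5.5, 5.4, 7.7, 7.2, 9.9]

halfwidth = conformal_halfwidth(cal_true, cal_pred, alpha=0.2)
low, high = predict_interval(4.2, halfwidth)
print(f"80% interval for a prediction of 4.2: [{low:.2f}, {high:.2f}]")
```

A normalized variant would divide each residual by a per-sample difficulty estimate before ranking, so that intervals widen for harder compounds; that step is omitted here.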
35	The adaptive potential of effectively small and shrinking populations Eriksson, Leonora January 2024 It is well known that genetic variation and the ability to adapt are crucial for the survival of any population. Whether it be about a natural population’s ability to respond to changes in its environment, or a population of livestock’s ability to produce more milk, genetic variation is a key element. Effectively small populations have an increased risk of extinction caused by reduced ability to adapt or respond to selection. Small populations are also more affected by genetic drift, which can cause deleterious mutations to fixate, reducing the population’s fitness possibly to the point where it is unable to survive. Models describing changes in allele frequencies in a population under selection can be used to study a population’s response to selection. A limitation of such models is that they often assume infinite population size and neglect the effects of genetic drift, making them unsuitable for working with effectively small populations. Here, an individual-based model of a quantitative trait affected by selection, mutation and genetic drift is used to study the adaptive potential of effectively small populations. In a series of simulations, changes in the trait are explored under directional selection and stabilizing selection with adaptation to one, and several repeated shifts in optimum. Results of the simulations include that populations under strong directional selection, such as breeding, potentially risk losing all adaptive potential. Results also suggest that effects of strong directional selection might be irreversible, even if the strong selective pressure is removed. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
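The ingredients named in this abstract (selection, mutation, and drift acting on a quantitative trait in a finite population) can be sketched in a few lines. The model below is an assumption for illustration only: asexual individuals, Gaussian stabilizing selection around an optimum, and Gaussian mutational effects. The thesis's actual model, parameters, and function names are not given in the abstract.

```python
import math
import random


def simulate(pop_size, generations, optimum, selection_strength,
             mutation_sd, start_value=0.0, seed=0):
    """Track the mean trait value of a finite population over time.

    ASSUMPTIONS (not from the thesis): asexual reproduction, discrete
    generations, Gaussian fitness around `optimum`, Gaussian mutations.
    """
    rng = random.Random(seed)
    population = [start_value] * pop_size
    mean_trait = []
    for _ in range(generations):
        # Selection: fitness falls off with squared distance from the optimum.
        weights = [math.exp(-selection_strength * (z - optimum) ** 2) for z in population]
        # Drift: a finite number of parents is sampled with replacement.
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Mutation: each offspring deviates slightly from its parent.
        population = [z + rng.gauss(0.0, mutation_sd) for z in parents]
        mean_trait.append(sum(population) / pop_size)
    return mean_trait


# A small population adapting to an optimum shifted away from its starting value.
trajectory = simulate(pop_size=20, generations=50, optimum=2.0,
                      selection_strength=0.5, mutation_sd=0.1, seed=1)
# Without mutation there is no variation for selection to act on.
no_mutation = simulate(pop_size=20, generations=10, optimum=2.0,
                       selection_strength=0.5, mutation_sd=0.0)
print(f"final mean trait with mutation: {trajectory[-1]:.2f}")
```

The second run makes the role of variation concrete: with identical individuals and no mutation, the mean trait never moves, however strong the selection.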
36	The Evolutionary History of Picozoa : Phylogenomic inquiries into the plastid-lacking Archaeplastids Wanntorp, Matias January 2024 No description available. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
37	BacIL - En Bioinformatisk Pipeline för Analys av Bakterieisolat / BacIL - A Bioinformatic Pipeline for Analysis of Bacterial Isolates Östlund, Emma January 2019 Listeria monocytogenes and Campylobacter spp. are bacteria that can sometimes cause severe illness in humans. Both can be found as contaminants in food that has been produced, stored or prepared improperly, which is why it is important to ensure that food is handled correctly. The National Food Agency (Livsmedelsverket) is the Swedish authority responsible for food safety. One of its important tasks is to track and prevent food-related disease outbreaks in collaboration with other authorities. For this purpose, bacterial samples are regularly collected at border control, at food production facilities and in retail, as well as from suspected food items and drinking water during outbreaks, and epidemiological analyses are employed to determine the type of bacteria present and whether they can be linked to a common source. One part of these epidemiological analyses involves bioinformatic analyses of the bacterial DNA. This includes determination of sequence type and serotype, as well as calculations of similarities between samples. Such analyses require data processing in several different steps, which are usually performed by a bioinformatician using different computer programs. Currently the National Food Agency outsources most of these analyses to other authorities and companies, and the purpose of this project was to develop a pipeline that would allow these analyses to be performed in-house. The result was a pipeline named BacIL - Bacterial Identification and Linkage, which has been developed to automatically perform sequence typing, serotyping and SNP-analysis of Listeria monocytogenes as well as sequence typing and SNP-analysis of Campylobacter jejuni, C. coli and C. lari. The result of the SNP-analysis is used to create clusters which can be used to identify related samples. The pipeline decreases the number of programs that have to be manually started from more than ten to two. bioinformatik pipeline Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
38	Framtidens biomarkörer : En prioritering av proteinerna i det humana plasmaproteomet Antonsson, Elin, Eulau, William, Fitkin, Louise, Johansson, Jennifer, Levin, Fredrik, Lundqvist, Sara, Palm, Elin January 2019 In this report, we rank possible protein biomarkers based on different criteria for use in Olink Proteomics’ protein panels. We started off with a list compiled through the Human Plasma Proteome Project (HPPP) and have used it in different ways to obtain the final results. To complete this task we compared the list with Olink’s and its competitors’ protein catalogs, and identified diseases beyond Olink’s coverage and the proteins linked with them. We also created a scoring system to facilitate detection of good biomarkers. From this, we have concluded that Olink should focus on proteins that the competitors have in their catalogs and proteins that can be found in many pathways and are linked with many diseases. From each of the methods used, we have been able to identify a number of proteins that we recommend Olink investigate further. Biomarkör plasmaproteom Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
39	Method for recognizing local descriptors of protein structures using Hidden Markov Models Björkholm, Patrik January 2008 Being able to predict the sequence-structure relationship in proteins will extend the scope of many bioinformatics tools relying on structure information. Here we use Hidden Markov models (HMM) to recognize and pinpoint the location in target sequences of local structural motifs (local descriptors of protein structure, LDPS). These substructures are composed of three or more segments of amino acid backbone structures that are in proximity with each other in space but not necessarily along the amino acid sequence. We were able to align descriptors to their proper locations in 41.1% of the cases when using models built solely from amino acid information. Using models that also incorporated secondary structure information, we were able to assign 57.8% of the local descriptors to their proper location. A further enhancement in performance was obtained when threading a profile through the Hidden Markov models together with the secondary structure; with this material we were able to assign 58.5% of the descriptors to their proper locations. Hidden Markov models were shown to be able to locate LDPS in target sequences, and the accuracy increases when secondary structure and the profile for the target sequence are used in the models. Bioinformatics Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
40	The role of RFX-target genes in neurodevelopmental and psychiatric disorders Ganesan, Abhishekapriya January 2021 Neurodevelopmental disorders such as autism spectrum disorder (ASD) and psychiatric disorders such as schizophrenia (SCZ) represent a large spectrum of disorders that manifest through cognitive and behavioural problems. ASD and SCZ are both highly heritable, and some phenotypic similarities between ASD and SCZ have sparked an interest in understanding their genetic commonalities. The genetics of both disorders exhibit significant heterogeneity. Developments in genomics and systems biology continually increase our understanding of these disorders. Recently, pathogenic genetic variants in the regulatory factor X (RFX) family of transcription factors have been identified in a number of ASD cases. In this thesis, common genetic variants and expression patterns of genes identified to have a conserved promoter X-Box motif region, a binding site of RFX factors, are studied. Significant common variants identified through expression quantitative trait loci (eQTLs) and genome wide association studies (GWAS) are mapped to the regulatory regions of these genes and analysed for putative enrichment. In addition, single-cell RNA sequencing data is utilised to examine enrichment of cell types having high X-Box gene expression in the developing human cortex. Through the study, genes that have eQTLs or SNPs in the genomic regulatory regions of the X-Box genes have been identified. While there were no eQTLs or GWAS SNPs in the X-Box motifs, some common variants were found in the X-Box promoter regions. Hypergeometric distribution testing and the resulting p-values indicate that all of these distributions are statistically under-enriched. Further, major cell types in the cortical region with increased expression of the X-Box genes, and the most expressed genes among these enriched cell types, have been identified. Among the 11 cell types, seven were found to be enriched for X-Box genes, and many of the most expressed genes in these cell types were similar. A further study into the cell types and genes identified, along with additional systems biological data analysis, could reveal a larger list of X-Box genes involved in ASD and SCZ and the specific roles of these genes. Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi) Neurosciences Neurovetenskaper
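The hypergeometric test mentioned in this abstract can be written out directly. The sketch below computes lower-tail (under-enrichment) and upper-tail (over-enrichment) p-values with the standard library; the function names and the example counts are hypothetical and do not come from the thesis.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k): exactly k successes among `draws` items sampled without replacement."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)


def under_enrichment_p(k, population, successes, draws):
    """Lower-tail p-value P(X <= k); small values indicate under-enrichment."""
    lowest = max(0, draws - (population - successes))  # smallest possible count
    return sum(hypergeom_pmf(i, population, successes, draws) for i in range(lowest, k + 1))


def over_enrichment_p(k, population, successes, draws):
    """Upper-tail p-value P(X >= k); small values indicate over-enrichment."""
    highest = min(successes, draws)  # largest possible count
    return sum(hypergeom_pmf(i, population, successes, draws) for i in range(k, highest + 1))


# Hypothetical counts: 20 regions in total, 10 of them carrying a variant,
# 5 regions belonging to the gene set of interest, none of which carries a variant.
p_under = under_enrichment_p(0, population=20, successes=10, draws=5)
print(f"under-enrichment p-value: {p_under:.4f}")
```

Choosing the lower tail rather than the upper tail is what distinguishes a test for under-enrichment from the more common over-enrichment test.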
