  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Factors Affecting How Well Bacterial Whole Genome Sequencing Reads Assemble

Linda, Mustafa January 2021
Whole genome sequencing (WGS) has recently become the new high-resolution tool for tracing the source of foodborne outbreaks. Often only a few genetic differences distinguish closely related bacterial isolates, so variability in data quality between laboratories may influence the results. This project used a data set from an interlaboratory study in which ten laboratories sequenced the same bacterial samples using different library preparation kits and sequencing methods. Factors that could explain the differences in how well the raw WGS data from the different labs assemble were investigated. The raw data from the different labs assembled very differently. One lab had adapter sequences in its reads, and filtering them out improved the assembly substantially. All labs that used the transposase-based library preparation kit Nextera had base composition bias at the beginning of the reads. For many labs, as the coverage was increased, the number of contigs first increased and then decreased. This was due to a low number of contaminating reads from other species. However, this contamination was barely visible in the plots generated by Kraken/Krona. Filtering out contigs with very low coverage removed the problem. Two labs performed much worse than the others. Some of their reads showed a quality drop towards the ends, and their data also had the longest read length. However, quality trimming the read ends did not improve the assembly. These two labs also had higher GC content in their reads than the other labs; the reason for this needs further investigation.
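
The low-coverage contig filtering mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual pipeline: the contig records, the field names and the cutoff rule (a fraction of the median coverage, here a hypothetical 10%) are all assumptions made for the example.

```python
from statistics import median


def filter_low_coverage(contigs, min_fraction=0.1):
    """Keep contigs whose coverage is at least `min_fraction` of the median.

    `contigs` is a list of dicts with a "coverage" key (an assumed layout);
    `min_fraction` is a hypothetical threshold, not a value from the thesis.
    """
    if not contigs:
        return []
    cutoff = min_fraction * median(c["coverage"] for c in contigs)
    return [c for c in contigs if c["coverage"] >= cutoff]


# Invented toy assembly: three well-covered contigs and one short contig
# with very low coverage, as a contaminating read set might produce.
contigs = [
    {"name": "contig_1", "length": 250_000, "coverage": 48.0},
    {"name": "contig_2", "length": 180_000, "coverage": 52.0},
    {"name": "contig_3", "length": 900, "coverage": 1.5},
    {"name": "contig_4", "length": 120_000, "coverage": 45.0},
]

kept = filter_low_coverage(contigs)
kept_names = [c["name"] for c in kept]
print(kept_names)
```

A relative cutoff is used here so that the same rule can be applied to runs with different sequencing depth; an absolute coverage threshold would be an equally plausible choice.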
152

Developing Automated Cell Segmentation Models Intended for MERFISH Analysis of the Cardiac Tissue by Deploying Supervised Machine Learning Algorithms / Utveckling av automatiserade cellsegmenteringsmodeller avsedda för MERFISH-analys av hjärtvävnad genom användning av övervakade maskininlärningsalgoritmer

Rune, Julia January 2023
The following study addresses the development of automated cell segmentation models intended to identify boundaries between cells in cardiac tissue. The aim is to enable analysis of data generated by multiplexed error-robust fluorescence in situ hybridization (MERFISH). MERFISH is a spatial transcriptomics technique that, unlike for example single-cell RNA sequencing (ScRNA-seq) and single molecule fluorescence in situ hybridization (smFISH), enables profiling of hundreds of RNA sequences in individual cells without losing their spatial context. In the Kosuri Lab at the Salk Institute for Biological Studies in San Diego, MERFISH is applied to mouse hearts. The aim is to gain deeper insight into how cells are organized in healthy hearts and how this structure changes with aging and disease. Extracting meaningful information from MERFISH, however, poses a significant challenge: accurate cell segmentation. The study therefore contributes to the development of segmentation models that overcome the challenges standing in the way of all downstream analysis. Because classical segmentation algorithms are insufficient for segmenting tissue as complex as the heart, some of today's most advanced and prominent machine learning algorithms in the field, Cellpose and Omnipose, were applied. Given the dense and heterogeneous cardiac tissue, which stems from a broad distribution of cell types and geometries, two separate models were developed: one to cover both smaller cells and cardiomyocytes cut in cross-section, and one to segment only cardiomyocytes cut in the longitudinal direction. The former model was developed and trained in Cellpose and achieved an accuracy of 91.2%. The model for longitudinal cardiomyocytes was instead developed in both Cellpose and Omnipose to evaluate which network is best suited to the purpose.
Neither network achieved an accuracy high enough to be applicable, and both are therefore in need of further training. The model generated in Omnipose is nevertheless judged to be the most promising, given its more complete segmentation. Further areas of development include segmentation of cells in fibrosis-dense regions and a 3D segmentation of the whole heart to achieve a more complete MERFISH analysis. In summary, the generated segmentation models have paved the way for a rigorous MERFISH analysis of the heart. By revealing some of the structural and functional causes of heart failure at the cellular level, we can thus in the long term contribute to the development of more effective therapeutic strategies. / The following study delves into the development of automated cell segmentation models, with the intention of identifying boundaries between cells in the cardiac tissue for analysing spatial transcriptomics data. Addressing the limitations of alternative techniques like single-cell RNA sequencing (ScRNA-seq) and single molecule fluorescence in situ hybridization (smFISH), the study underscores the innovative use of multiplexed error-robust fluorescence in situ hybridization (MERFISH) deployed by the Kosuri Lab at Salk Institute for Biological Studies. This advanced imaging-based technique allows for single-cell transcriptome profiling of hundreds of different transcripts while retaining the spatial context of the tissue. The technique can accordingly reveal how the organization of cells within a healthy heart is altered during disease. However, the extraction of meaningful data from MERFISH poses a significant challenge: accurate cell segmentation. This thesis therefore presents the development of a robust model for cell boundary identification within cardiac tissue, leveraging some of the advanced supervised machine learning algorithms in the field, named Cellpose and Omnipose.
Due to the dense and highly heterogeneous tissue, stemming from a wide distribution of cell types and shapes, two separate models had to be developed: one that covers the smaller cells and the cross-sectioned cardiomyocytes, and one that covers the longitudinal cardiomyocytes. The cross-section model was successfully developed to achieve an accuracy of 91.2%, whereas the longitudinal model still needs further improvements before being implemented. The thesis acknowledges potential areas for improvement, emphasizing the need to further improve the segmentation of longitudinal cardiomyocytes, tackle the challenges of segmenting cells within fibrotic regions of the diseased heart, and achieve a precise 3D cell segmentation. Nonetheless, the generated models have paved the way towards enabling efficient downstream MERFISH analysis to ultimately understand the structural and functional dynamics of heart failure at a cellular level, aiding the development of more effective therapeutic strategies.
153

Optimisation of autoencoders for prediction of SNPs determining phenotypes in wheat

Nair, Karthik January 2021
The increase in demand for food has resulted in increased demand for tools that help streamline the plant breeding process in order to create new crop varieties. Identifying the underlying genetic mechanisms of favourable characteristics is essential for making the best breeding decisions. In this project we developed a modified autoencoder model that allows lateral phenotype injection into the latent layer, in order to identify causal SNPs for phenotypes of interest in wheat. SNP and phenotype data for 500 samples of Lantmännen SW Seed, provided by Lantmännen, were used to train the network. An artificial phenotype created from a single SNP was used during training instead of a real phenotype, since the relationship between that phenotype and the SNP is already known. The modified model with lateral phenotype injection showed a significant increase in genotype concordance for the artificial phenotype compared with the control model without phenotype injection. The causal SNP was successfully identified using a concordance terrain graph, in which the difference in concordance of individual SNPs between the modified model and the control model was plotted against the genomic position of each SNP. The model requires further testing to elucidate its behaviour for phenotypes linked to multiple SNPs.
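
The per-SNP concordance difference behind the concordance terrain graph described above might be computed along these lines. This is only a sketch under stated assumptions: genotypes are coded 0/1/2, rows are samples and columns are SNPs ordered by genomic position, and the tiny matrices are invented for illustration rather than taken from the thesis.

```python
def per_snp_concordance(true_genotypes, predicted_genotypes):
    """Fraction of samples whose predicted genotype matches the truth, per SNP.

    Both arguments are lists of rows (samples) of equal length (SNPs).
    """
    n_samples = len(true_genotypes)
    n_snps = len(true_genotypes[0])
    return [
        sum(
            true_genotypes[i][j] == predicted_genotypes[i][j]
            for i in range(n_samples)
        ) / n_samples
        for j in range(n_snps)
    ]


# Invented toy data: 4 samples x 3 SNPs, genotypes coded 0, 1 or 2.
truth = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
control = [[0, 1, 0], [1, 0, 0], [2, 0, 2], [0, 2, 1]]   # no phenotype injection
injected = [[0, 1, 2], [1, 0, 0], [2, 0, 1], [0, 2, 2]]  # with phenotype injection

control_conc = per_snp_concordance(truth, control)
injected_conc = per_snp_concordance(truth, injected)

# Gain in concordance per SNP; plotting this against genomic position
# would give the "terrain", and the peak marks the candidate SNP.
gains = [inj - ctl for inj, ctl in zip(injected_conc, control_conc)]
candidate = max(range(len(gains)), key=gains.__getitem__)
print(gains, candidate)
```

The peak of `gains` is taken as the candidate SNP here; how the thesis itself selects the peak is not stated in the abstract.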
154

ProTargetMiner one step further : Deep comparative proteomics of Dying vs. Surviving cancer cells treated with anticancer compounds

Lundin, Albin January 2022
Cancer is a leading cause of mortality worldwide, responsible for nearly one in six deaths. Thus, there is a need for a greater understanding of cancer for the development of novel therapeutics. This master thesis project aims to compare the proteome signatures between dying and surviving cancer cells treated with diverse anticancer drugs. The first aim is to investigate if drug targets behave similarly and have the same sign (up- or down-regulation) in dying versus surviving cells. The second aim is to validate that combining the dying cancer cell’s proteome with the surviving cell’s can help improve drug target rankings for anticancer treatments. The third aim is to identify proteins and pathways involved in life and death decisions by comparing dying and surviving states in response to the anticancer drugs in different cell lines. First, we demonstrate that drug target behaviour in dying versus surviving cells is almost identical for nine diverse anticancer compounds with a correlation of 0.93. To identify drug targets, orthogonal partial least squares-discriminant analysis (OPLS-DA) modelling was performed to contrast the proteome signature of one anticancer drug against all other drugs and rank the proteins based on the magnitude of the model’s predictive component. There were occasions when the dying cells gave better rankings than the surviving ones. In some cases, the best target rankings were obtained when combining the data from both surviving and dying cells. To identify proteins and pathways involved in life and death decisions, OPLS-DA modelling contrasting the two states was performed, and heatmaps and scatterplots of dying and surviving log2 fold changes were made. As a result, several pathways involved in cell survival and cell death were identified. In addition, at least six proteins consistently differentially regulated between the surviving and dying cells were identified. 
Such proteins can be considered as putative survival (resistance) or sensitivity biomarkers and serve as potential drug targets for the development of novel anticancer agents.
155

Exploring HMGB1 protein-protein interactions in the monocytic cell lineage THP-1.

Tsang, Choi January 2022
High mobility group box 1 (HMGB1) was first identified as a chromatin-associated protein and was later discovered to initiate and regulate inflammation by inducing cytokine production, cell migration and cell differentiation. HMGB1 forms complexes with a variety of proteins (e.g. C1q, LPS, CXCL12, IL-1a, IL1b, Beclin-6, p53) that in turn play a role in different cellular mechanisms. However, most HMGB1-protein complexes identified so far are found in the extracellular space, whereas intracellular HMGB1-protein complexes are far less well defined. Data on the HMGB1 interactome had previously been generated by Rebecka Heinbäck (Erlandsson Harris group at KI). The HMGB1 interactome was identified in resting and in LPS-stressed THP-1 cells using a method called BioID. The objective was to explore possible intracellular HMGB1 protein-protein interactions under resting and inflammatory conditions. HMGB1 in complex with other proteins is known to have crucial functions, so our investigation can provide knowledge important for developing future therapeutics targeting HMGB1, as well as further knowledge of the intracellular functions of HMGB1. In this project, we used a combination of computational analysis tools to explore the roles of HMGB1 and its interactome. We then selected proteins from the BioID dataset and investigated them further for direct protein-protein interactions with HMGB1 using computational modelling as well as laboratory techniques such as co-immunoprecipitation. Our data reveal functional and biological differences of HMGB1 in resting and LPS-activated THP-1 cells. In resting cells, the HMGB1 interactome is involved in transduction and transcription processes, whereas under LPS-stressed conditions HMGB1 is implicated in apoptosis, HATs, and antiviral mechanisms, mainly when localised in the cytosol.
Additionally, we revealed a potential direct interaction of HMGB1 with S100A6 and with HCLS1, each of which can induce different functionalities. Finally, we further explored the possible interaction of the HMGB1:S100A6 complex with RAGE, where we found interesting preliminary results that should be explored further. To conclude, this thesis suggests new direct intracellular interaction partners of HMGB1 and indicates a shift in the HMGB1 interactome following LPS stress.
156

Analyzing Lower Limb Motion Capture with Smartphone : Possible improvements using machine learning / Analys av rörelsefångst för nedre extremiteterna med smartphone : Möjliga förbättringar med hjälp av maskininlärning

Brink, Anton January 2024
Human motion analysis (HMA) can play a crucial role in sports and healthcare by providing unique insights into movement mechanics in the form of objective measurements and quantitative data. Traditional, state-of-the-art, marker-based techniques, despite their accuracy, come with financial and logistical barriers and are restricted to laboratory settings. Markerless systems offer much improved affordability and portability, and can potentially be used outside of laboratories. However, these advantages come at a significant cost in accuracy. This thesis attempts to address the challenge of democratizing HMA by leveraging recent advances in smartphone technology and machine learning. This thesis evaluates two modalities of performing markerless HMA, a single smartphone using Apple ARKit and a multiple-smartphone setup using OpenCap, and compares both to a state-of-the-art multiple-camera marker-based system from Vicon. Additionally, this thesis presents and evaluates two approaches to improving the single-smartphone modality: employing a Gaussian process regression (GPR) model and a long short-term memory (LSTM) neural network to refine the single-smartphone data so that it aligns with the marker-based result. Specific movements were recorded simultaneously with all three modalities on 13 subjects to build a dataset. From this, GPR and LSTM models were trained and applied to refine the single-camera modality data. Lower limb joint angles and joint centers were evaluated across the different modalities and analyzed for potential use in real-world applications. The findings of this thesis are promising, as both the GPR and LSTM models improve the accuracy of Apple ARKit, and OpenCap provides accurate and consistent results. However, it is important to acknowledge limitations regarding demographic diversity and how real-world environmental factors may influence its application.
This thesis contributes to the efforts to narrow the gap between marker-based HMA methods and more accessible solutions. / Human motion analysis (HMA) can play a significant role in both sports and healthcare. Through objective and quantitative data, it gives unique insight into the mechanics of movement. Traditional, state-of-the-art, marker-based techniques are very precise but entail financial and logistical barriers and are only available in laboratories. Markerless systems offer much better affordability and portability and can potentially be used outside laboratories. These advantages, however, go hand in hand with a significant reduction in accuracy. This thesis attempts to address the challenge of democratizing HMA by leveraging recent advances in smartphone technology and machine learning. This thesis evaluates two ways of performing markerless HMA: using a single smartphone running Apple ARKit, and using a setup of multiple smartphones running OpenCap. Both modalities are compared with a multiple-camera marker-based system from Vicon. In addition, two methods for improving the single-smartphone modality are presented and evaluated: a Gaussian process regression (GPR) model and a long short-term memory (LSTM) neural network, used to refine the data from the single-smartphone modality so that it agrees better with the marker-based result. Specific movements were recorded simultaneously with all three modalities on 13 subjects to build a dataset. From this, GPR and LSTM models were trained and used to improve the data from the single-camera modality (Apple ARKit). Joint angles and joint centers of the lower limbs were evaluated across the modalities and analyzed for potential use in real-world applications.
Although the results of this thesis are promising, since both the GPR and LSTM models improve the accuracy of Apple ARKit and OpenCap gives accurate and consistent results, it is important to acknowledge the limitations regarding demographic diversity and how real-world environmental factors may affect the application.
157

Searching for novel protein-protein specificities using a combined approach of sequence co-evolution and local structural equilibration

Nordesjö, Olle January 2016
Greater understanding of how we can use protein simulations and statistical characteristics of biomolecular interfaces as proxies for biological function will make manifest major advances in protein engineering. Here we show how to use calculated change in binding affinity and coevolutionary scores to predict the functional effect of mutations in the interface between a Histidine Kinase and a Response Regulator. These proteins participate in the Two-Component Regulatory system, a system for intracellular signalling found in bacteria. We find that both scores work as proxies for functional mutants and demonstrate a ~30 fold improvement in initial positive predictive value compared with choosing randomly from a sequence space of 160 000 variants in the top 20 mutants. We also demonstrate qualitative differences in the predictions of the two scores, primarily a tendency for the coevolutionary score to miss out on one class of functional mutants with enriched frequency of the amino acid threonine in one position.
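
The fold improvement in positive predictive value (PPV) over random selection, as reported in the abstract above, is a simple ratio that can be sketched as follows. Only the library size of 160 000 variants and the top-20 cutoff come from the abstract; the hit count and the number of functional variants below are invented purely to show the arithmetic and do not reproduce the thesis's result.

```python
def positive_predictive_value(true_positives, predicted_positives):
    """PPV = true positives / all predicted positives."""
    if predicted_positives <= 0:
        raise ValueError("predicted_positives must be positive")
    return true_positives / predicted_positives


def fold_improvement(hits_in_top_k, k, functional_total, library_size):
    """PPV among the top-k predictions relative to random picking.

    Random picking has an expected PPV equal to the fraction of
    functional variants in the whole library.
    """
    ppv = positive_predictive_value(hits_in_top_k, k)
    baseline = functional_total / library_size
    return ppv / baseline


# Hypothetical numbers: 5 functional variants among the top 20 predictions,
# and 2 000 functional variants assumed in the 160 000-variant library.
fold = fold_improvement(hits_in_top_k=5, k=20,
                        functional_total=2_000, library_size=160_000)
print(round(fold, 1))
```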
158

Ett sannolikhetsbaserat kvalitetsmått förbättrar klassificeringen av oförväntade sekvenser i in situ sekvensering / A probability-based quality measure improves the classification of unexpected sequences in in situ sequencing

Nordesjö, Olle, Pontén, Victor, Herman, Stephanie, Ås, Joel, Jamal, Sabri, Nyberg, Alona January 2014
In situ sequencing is a method that can be used to localize differential expression of mRNA directly in tissue sections, which can give important clues about many disease states. Today, many of the sequences from in situ sequencing are lost because of the quality measure used to ensure that sequences are correct. There is probably room to improve the performance of the current base calling method, since the method is at an early stage of development. We have carried out exploratory data analysis to investigate the occurrence of systematic errors and corrected for these using statistical methods. We have primarily investigated three methods for correcting systematic errors: I) Correction of the bleed-through that occurs because of overlapping emission spectra between fluorescent probes. II) A probability-based interpretation of intensity data, resulting in a new quality measure and an alternative classifier based on supervised learning. III) An investigation of the occurrence of cycle-dependent effects, for example incomplete dehybridization of fluorescent probes. We propose the following: implementing and evaluating the probability-based quality measure; developing and implementing the proposed classifier; and carrying out further experiments to demonstrate or refute the occurrence of incomplete dehybridization. / In situ sequencing is a method that can be used to localize differential expression of mRNA directly in tissue sections, something that can give valuable insights into many disease states. Today, many of the registered sequences from in situ sequencing are lost due to a conservative quality measure used to filter out incorrect sequencing reads. There is room for improvement in the performance of the current method for base calling since the technology is in an early stage of development.
We have performed exploratory data analysis to investigate the occurrence of systematic errors, and corrected for these using various statistical methods. The primary methods that have been investigated are the following: I) Correction of emission spectra overlap resulting in spillover between channels. II) A probability-based interpretation of intensity data, resulting in a novel quality measure and an alternative classifier based on supervised learning. III) Analysis of the occurrence of cycle-dependent effects, e.g. incomplete dehybridization of fluorescent probes. We suggest the following: implementation and evaluation of the probability-based quality measure; development and implementation of the proposed classifier; and additional experiments to investigate the possible occurrence of incomplete dehybridization.
159

Functional association networks for disease gene prediction

Guala, Dimitri January 2017
Mapping of the human genome has been instrumental in understanding diseases caused by changes in single genes. However, disease mechanisms involving multiple genes have proven to be much more elusive. Their complexity emerges from interactions of intracellular molecules and makes them immune to the traditional reductionist approach. Only by modelling this complex interaction pattern using networks is it possible to understand the emergent properties that give rise to diseases. The overarching term used to describe both physical and indirect interactions involved in the same functions is functional association. FunCoup is one of the most comprehensive networks of functional association. It uses a naïve Bayesian approach to integrate high-throughput experimental evidence of intracellular interactions in humans and multiple model organisms. In the first update, both the coverage and the quality of the interactions, were increased and a feature for comparing interactions across species was added. The latest update involved a complete overhaul of all data sources, including a refinement of the training data and addition of new class and sources of interactions as well as six new species. Disease-specific changes in genes can be identified using high-throughput genome-wide studies of patients and healthy individuals. To understand the underlying mechanisms that produce these changes, they can be mapped to collections of genes with known functions, such as pathways. BinoX was developed to map altered genes to pathways using the topology of FunCoup. This approach combined with a new random model for comparison enables BinoX to outperform traditional gene-overlap-based methods and other network-based techniques. Results from high-throughput experiments are challenged by noise and biases, resulting in many false positives. Statistical attempts to correct for these challenges have led to a reduction in coverage.
Both limitations can be remedied using prioritisation tools such as MaxLink, which ranks genes using guilt by association in the context of a functional association network. MaxLink’s algorithm was generalised to work with any disease phenotype and its statistical foundation was strengthened. MaxLink’s predictions were validated experimentally using FRET. The availability of prioritisation tools without an appropriate way to compare them makes it difficult to select the correct tool for a problem domain. A benchmark to assess performance of prioritisation tools in terms of their ability to generalise to new data was developed. FunCoup was used for prioritisation while testing was done using cross-validation of terms derived from Gene Ontology. This resulted in a robust and unbiased benchmark for evaluation of current and future prioritisation tools. Surprisingly, previously superior tools based on global network structure were shown to be inferior to a local network-based tool when performance was analysed on the most relevant part of the output, i.e. the top ranked genes. This thesis demonstrates how a network that models the intricate biology of the cell can contribute with valuable insights for researchers that study diseases with complex genetic origins. The developed tools will help the research community to understand the underlying causes of such diseases and discover new treatment targets. The robust way to benchmark such tools will help researchers to select the proper tool for their problem domain. / At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 5: Manuscript. Paper 6: Manuscript.
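
The general idea of guilt-by-association ranking in a network, mentioned in the abstract above, can be illustrated with a small sketch. This is not MaxLink's actual algorithm or statistics: it simply scores each non-seed gene by how many known disease (seed) genes it is directly linked to, and the gene names and edges are hypothetical.

```python
from collections import defaultdict


def rank_by_seed_links(edges, seeds):
    """Rank non-seed genes by their number of direct links to seed genes.

    `edges` is an iterable of undirected (gene_a, gene_b) pairs; genes with
    no seed neighbour are left out. Ties are broken alphabetically.
    """
    seeds = set(seeds)
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    scores = {
        gene: len(linked & seeds)
        for gene, linked in neighbours.items()
        if gene not in seeds
    }
    return sorted(
        ((gene, score) for gene, score in scores.items() if score > 0),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical network: S1..S3 are known disease genes, A..D are candidates.
edges = [
    ("S1", "A"), ("S2", "A"), ("S3", "A"),
    ("S1", "B"), ("S2", "B"),
    ("S3", "C"),
    ("C", "D"),
]
ranking = rank_by_seed_links(edges, seeds=["S1", "S2", "S3"])
print(ranking)
```

Counting only direct neighbours makes this a local network measure; methods based on global network structure would instead propagate scores across the whole graph.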
160

Developing new methods for estimating population divergence times from sequence data

Svärd, Karl January 2021
Methods for estimating past demographic events of populations are powerful tools for gaining insight into otherwise hidden pasts. Human genetic data are a valuable resource for these purposes, as patterns of variation can inform us of the past evolutionary forces and historical events that generated them. There is, however, a lack of methods within the field that use this information to its full extent. This project therefore set out to develop a set of new alternatives for estimating demographic events. The work was based on modifying the purely sequence-based method TTo (Two-Two-outgroup) for estimating the divergence time of two populations. The modifications consisted of using beta distributions to model the polymorphic diversity of the ancestral population in order to increase the maximum possible sample size. The project resulted in two implemented methods: TT-beta and a partial variant of MM. TT-beta produced estimates in the same range as TTo and showed that the use of beta distributions has real potential. For MM, only a partial implementation could be completed, but it also showed promise, including the ability to use varying sample sizes to estimate demographic values.
