  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Deep morphological quantification and clustering of brain cancer cells using phase-contrast imaging

Engberg, Jonas January 2021
Glioblastoma multiforme (GBM) is a very aggressive brain tumour. Previous studies have suggested that the morphological distribution of single GBM cells may hold information about the severity of the disease. This study investigates whether automated morphological quantification and clustering of GBM cells is feasible and what it reveals. Phase-contrast images from 10 different GBM cell cultures were analyzed. To test the hypothesis that morphological differences exist between the cell cultures, single-cell images were extracted from whole-well images using CellProfiler and Python. The single-cell images were passed through several feature extraction models to identify the model showing the most promise for this dataset. The features were then clustered and quantified to see whether any differentiation exists between the cell cultures. The results suggest that morphological feature differences exist between GBM cell cultures when using automated models. The siamese network managed to construct clusters of cells with very similar morphology. I conclude that the 10 cell cultures seem to have cells with morphological differences. This highlights the importance of future studies to determine what these morphological differences imply for patients' survivability and choice of treatment.
2

Pathological Image Analysis with Supervised and Unsupervised Deep Learning Approaches

Nasrin, Mst Shamima 18 May 2021
No description available.
3

Combining Node Embeddings From Multiple Contexts Using Multi Dimensional Scaling

Yandrapally, Aruna Harini 04 October 2021
No description available.
4

Decoding communication of non-human species - Unsupervised machine learning to infer syntactical and temporal patterns in fruit-bats vocalizations.

Assom, Luigi January 2023
Decoding non-human species communication offers a unique chance to explore alternative forms of intelligence using machine learning. This master's thesis focuses on discreteness and grammar, two of five linguistic areas that machine learning can support, and tackles the inference of syntax and temporal structures from bioacoustic data annotated with animal behavior. The problem lies in the lack of species-specific linguistic knowledge, time-consuming feature extraction, and the limited availability of data; additionally, unsupervised clustering struggles to discretize vocalizations that sound continuous to human perception, because it is unclear how to tune the parameters used to preprocess audio. This thesis investigates unsupervised learning to generalize the deciphering of syntax and short-range temporal patterns in continuous-type vocalizations, specifically those of fruit bats, and addresses two research questions: How does dimensionality reduction affect unsupervised manifold learning for quantifying the size and diversity of the animal repertoire? How do syntax and temporal structure encode contextual information? An experimental strategy is designed to improve the effectiveness of unsupervised clustering for quantifying the repertoire and to investigate linguistic properties with classifiers and sequence mining; acoustic segments are collected from a dataset of fruit-bat vocalizations annotated with behavior. The methodology keeps the clustering methods constant while varying the dimensionality reduction techniques applied to spectrograms and to their latent representations learnt by autoencoders. Uniform Manifold Approximation and Projection (UMAP) embeds the data into a manifold; density-based clustering is applied to the embeddings and compared with agglomerative-clustering labels, which serve as a ground-truth proxy to test the robustness of the models. Vocalizations are encoded into label sequences. Syntactic rules and short-range patterns in the sequences are investigated with classifiers (Support Vector Machines, Random Forests), graph analytics, and prefix-suffix trees.
Reducing the temporal dimension of Mel-spectrograms outperformed the previous clustering baseline (Silhouette score > 0.5, 95% assignment accuracy). UMAP embeddings from sequential autoencoders showed potential advantages over those from convolutional autoencoders. The study revealed a repertoire of between seven and approximately 20 vocal units characterized by combinatorial patterns: context classification achieved an F1-score > 0.9 even with permuted sequences, and repetition characterized the vocalizations of isolated pups. Vocal-unit distributions differed significantly (p < 0.05) across contexts; a truncated power law (alpha < 2) described the distribution of maximal repetitions. This thesis contributes to unsupervised machine learning in bioacoustics for decoding non-human communication, aiding research in language evolution and animal cognition.
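The abstract reports clustering quality as a Silhouette score. As a minimal sketch of what that metric measures, the following pure-Python function computes the mean silhouette coefficient with Euclidean distance. The toy points, the label lists, and the function name are illustrative assumptions and are not taken from the thesis, which works on spectrogram embeddings rather than 2-D points.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For each point, a = mean distance to the other members of its own cluster,
    b = smallest mean distance to the members of any other cluster,
    and the coefficient is (b - a) / max(a, b). Singletons contribute 0.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster has coefficient 0 by convention
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        if denom > 0:  # skip the degenerate case where all distances are zero
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]    # labels that match the two groups
mixed_labels = [0, 1, 0, 1, 0, 1]   # labels that cut across the groups

good = silhouette_score(points, good_labels)
mixed = silhouette_score(points, mixed_labels)
```

Values close to 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or wrongly assigned points, which is why a threshold such as 0.5 is a common rule of thumb for acceptable structure.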
5

全身イメージング質量分析法を用いたデキサメタゾン投与によるマウス胸腺を主軸とする免疫代謝変動の解明 / Elucidation of immunometabolic changes centered on the mouse thymus following dexamethasone administration, using whole-body imaging mass spectrometry

Yudai Tsuji (辻 雄大) 22 March 2022
Doctor of Philosophy in Science / Doshisha University
6

Predictive maintenance using NLP and clustering support messages

Yilmaz, Ugur January 2022
Communication with customers is a major part of the customer experience as well as a rich source of data for mining. More businesses are engaging with consumers via text messages. Before 2020, 39% of businesses already used some form of text messaging to communicate with their consumers, and many more were expected to adopt the technology after 2020 [1]. Email response rates are merely 8%, compared to a response rate of 45% for text messaging [2]. A significant portion of this communication involves customer enquiries or support messages sent in both directions. According to estimates, more than 80% of today's data is stored in an unstructured format (such as text, image, audio, or video) [3], with a significant portion of it expressed in ambiguous natural language. Such data is usually analyzed with qualitative data analysis techniques. To facilitate the automated examination of huge corpora of textual material, researchers have turned to natural language processing techniques [4]. In light of the statistics above, Billogram [5] has decided that support messages between creditors and recipients can be mined for predictive maintenance purposes, such as early identification of an outlier like a bug, a defect, or a wrongly built feature. Stated in one sentence, Billogram's goal is to answer the question "why are people reaching out to begin with?" This thesis project discusses the implementation of unsupervised clustering of support messages using natural language processing methods, together with performance metrics for the results, to answer Billogram's question. The research also covers intent recognition for the clustered messages in two different ways, one automatic and one semi-manual; the results are discussed and compared. The first approach, LDA with manual intent assignment, yielded 100 topics and a coherence score of 0.293. The second approach produced 158 clusters with UMAP and HDBSCAN, with automatic intent recognition. Creating clusters will help identify issues that can be subjects of increased focus, automation, or even down-prioritizing. Therefore, this research lands in the predictive maintenance [9] area. This study, which will improve over time with more iterations in the company, also contains the preliminary work for "labeling" or "describing" clusters and their intents.
7

Evaluation of Archetypal Analysis and Manifold Learning for Phenotyping of Acute Kidney Injury

Dylan M Rodriquez (10695618) 07 May 2021
Disease subtyping has been a critical aim of precision and personalized medicine. With the potential to improve patient outcomes, unsupervised and semi-supervised methods for determining phenotypes of subtypes have emerged, with a recent focus on matrix and tensor factorization. However, the interpretability of the proposed models is debatable. Principal component analysis (PCA), a traditional method of dimensionality reduction, does not impose non-negativity constraints. Thus, the coefficients of the principal components are, in some cases, difficult to translate to real physical units. Non-negative matrix factorization (NMF) constrains the factorization to non-negative numbers such that the representative types resulting from the factorization are additive. Archetypal analysis (AA) extends this idea and seeks to identify pure types, or archetypes, at the extremes of the data, from which all other data can be expressed as a convex combination, that is, by proportion, of the archetypes. Using AA, this study sought to evaluate the sufficiency of AKI staging criteria through unsupervised subtyping. Archetypal analysis failed to find a direct 1:1 mapping of archetypes to physician staging and also did not provide additional insight into patient outcomes. Several factors of the analysis, such as the quality of the data source and the difficulty of selecting features, contributed to this outcome. Additionally, after performing feature selection with lasso across data subsets, it was determined that the current staging criteria are sufficient to determine patient phenotype, with serum creatinine at time of diagnosis being a necessary factor.
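The convex-combination idea behind archetypal analysis can be written compactly. The following is the commonly used textbook formulation, given here as a sketch; the symbols are assumptions introduced for illustration and the thesis may use a different notation or variant. Let X be the n × d data matrix (one patient per row), k the number of archetypes, A an n × k coefficient matrix, and B a k × n matrix that builds the archetypes Z = BX from the data.

```latex
\min_{A,\,B}\ \lVert X - A B X \rVert_F^2
\quad \text{subject to} \quad
a_{ij} \ge 0,\ \sum_{j=1}^{k} a_{ij} = 1,
\qquad
b_{jl} \ge 0,\ \sum_{l=1}^{n} b_{jl} = 1 .
```

Under these constraints each observation is approximated as x_i ≈ Σ_j a_ij z_j, a mixture of archetypes whose weights are proportions, and each archetype z_j is itself a convex combination of observed data points, which places it on the boundary of the data rather than at a cluster centre.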
8

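Regroupement de textes avec des approches simples et efficaces exploitant la représentation vectorielle contextuelle SBERT / Text clustering with simple and effective approaches exploiting the SBERT contextual vector representation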

Petricevic, Uros 12 1900
Clustering is an unsupervised task that groups similar elements into the same cluster and different elements into distinct clusters. Text clustering is performed by representing texts in a vector space and studying their similarity in this space. The best results are obtained using neural models that fine-tune contextual embeddings in an unsupervised manner. However, these techniques can require a significant amount of training time, and their performance is not compared to simpler techniques that do not require training of neural models. In this master's thesis, we propose a study of the current state of the field. First, we study the best evaluation metrics for text clustering. Then, we evaluate the state of the art and take a critical look at its training protocol. We also propose an analysis of some implementation choices in text clustering, such as the choice of clustering algorithm, similarity measure, contextual embeddings, or unsupervised fine-tuning of the contextual embeddings. Finally, we test the combination of contextual embeddings with some techniques that do not require training, such as data preprocessing, dimensionality reduction, or Tf-idf inclusion. Our experiments demonstrate some shortcomings in the state of the art regarding the choice of evaluation metrics and the training protocol. Furthermore, we demonstrate that simple techniques yield results better than or similar to those of sophisticated methods requiring the training of neural models. Our experiments are evaluated on eight benchmark datasets from different domains.
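The abstract describes text clustering as representing texts in a vector space and studying their similarity there. The sketch below illustrates that step with plain Tf-idf vectors and cosine similarity, using only the standard library. It deliberately does not use SBERT, which requires a pretrained model; the smoothed IDF formula, the whitespace tokenizer, the example sentences, and all function names are assumptions chosen for illustration, not the thesis's implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse Tf-idf vector (term -> weight dict) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # term frequency times a smoothed inverse document frequency
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical example texts: the first two share vocabulary, the third does not.
docs = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "stock prices fell sharply",
]
vecs = tfidf_vectors(docs)
sim_close = cosine(vecs[0], vecs[1])
sim_far = cosine(vecs[0], vecs[2])
```

A clustering algorithm would then group texts whose pairwise similarity is high. Swapping the Tf-idf vectors for contextual embeddings changes only the representation step, which is the kind of implementation choice the thesis compares.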
9

A Machine Learning Model of Perturb-Seq Data for use in Space Flight Gene Expression Profile Analysis

Liam Fitzpatric Johnson (18437556) 27 April 2024
The genetic perturbations caused by spaceflight on biological systems tend to have a system-wide effect that is often difficult to deconvolute into individual signals with specific points of origin. Single-cell multi-omic data can provide a profile of the perturbational effects but does not necessarily indicate the initial point of interference within a network. The objective of this project is to take advantage of large-scale, genome-wide perturbational (Perturb-Seq) datasets by using them to pre-train a generalist machine learning model capable of predicting the effects of unseen perturbations in new data. Perturb-Seq datasets are large libraries of single-cell RNA sequencing data collected from CRISPR knockout screens in cell culture. The advent of generative machine learning algorithms, particularly transformers, makes this an ideal time to re-assess large-scale data libraries in order to grasp cell-wide and even organism-wide genomic expression motifs. By tailoring an algorithm to learn the downstream effects of the genetic perturbations, we present a pre-trained generalist model capable of predicting the effects of multiple perturbations in combination, locating points of origin for perturbations in new datasets, predicting the effects of known perturbations in new datasets, and annotating large-scale network motifs. We demonstrate the utility of this model by identifying key perturbational signatures in RNA sequencing data from spaceflown biological samples from the NASA Open Science Data Repository.
10

Demography of Birch Populations across Scandinavia

Sendrowski, Janek January 2022
Boreal forests are particularly vulnerable to climate change: they are experiencing a much more drastic increase in temperatures and have a limited number of more northern refugia. The trees making up these vast and important ecosystems have already had to adapt to environmental pressures brought about by the repeated glaciations of past ice ages. Studying the patterns of adaptation of these trees can thus provide valuable insights into how to mitigate future damage. This thesis presents and analyses the population structure, demographic history and distribution of fitness effects (DFE) of the diploid Betula pendula and the tetraploid B. pubescens across Scandinavia. Birches, being widespread in boreal forests and of great economic importance, constitute superb model species. The analyses confirm the expectations of postglacial population expansion and diploid-tetraploid introgression. They furthermore ascertain the presence of two genetic clusters and a remarkably similar DFE for the two species. This work also contributes a transparent, reproducible and reusable pipeline that facilitates running similar analyses for related species.
