• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 82
  • 6
  • 3
  • 1
  • Tagged with
  • 119
  • 119
  • 50
  • 46
  • 40
  • 35
  • 35
  • 34
  • 34
  • 30
  • 29
  • 27
  • 27
  • 27
  • 24
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Adversarial approaches to remote sensing image analysis

Bejiga, Mesay Belete 17 April 2020 (has links)
The recent advance in generative modeling in particular the unsupervised learning of data distribution is attributed to the invention of models with new learning algorithms. Among the methods proposed, generative adversarial networks (GANs) have shown to be the most efficient approaches to estimate data distributions. The core idea of GANs is an adversarial training of two deep neural networks, called generator and discriminator, to learn an implicit approximation of the true data distribution. The distribution is approximated through the weights of the generator network, and interaction with the distribution is through the process of sampling. GANs have found to be useful in applications such as image-to-image translation, in-painting, and text-to-image synthesis. In this thesis, we propose to capitalize on the power of GANs for different remote sensing problems. The first problem is a new research track to the remote sensing community that aims to generate remote sensing images from text descriptions. More specifically, we focus on exploiting ancient text descriptions of geographical areas, inherited from previous civilizations, and convert them the equivalent remote sensing images. The proposed method is composed of a text encoder and an image synthesis module. The text encoder is tasked with converting a text description into a vector. To this end, we explore two encoding schemes: a multilabel encoder and a doc2vec encoder. The multilabel encoder takes into account the presence or absence of objects in the encoding process whereas the doc2vec method encodes additional information available in the text. The encoded vectors are then used as conditional information to a GAN network and guide the synthesis process. We collected satellite images and ancient text descriptions for training in order to evaluate the efficacy of the proposed method. The qualitative and quantitative results obtained suggest that the doc2vec encoder-based model yields better images in terms of the semantic agreement with the input description. In addition, we present open research areas that we believe are important to further advance this new research area. The second problem we want to address is the issue of semi-supervised domain adaptation. The goal of domain adaptation is to learn a generic classifier for multiple related problems, thereby reducing the cost of labeling. To that end, we propose two methods. The first method uses GANs in the context of image-to-image translation to adapt source domain images into target domain images and train a classifier using the adapted images. We evaluated the proposed method on two remote sensing datasets. Though we have not explored this avenue extensively due to computational challenges, the results obtained show that the proposed method is promising and worth exploring in the future. The second domain adaptation strategy borrows the adversarial property of GANs to learn a new representation space where the domain discrepancy is negligible, and the new features are discriminative enough. The method is composed of a feature extractor, class predictor, and domain classifier blocks. Contrary to the traditional methods that perform representation and classifier learning in separate stages, this method combines both into a single-stage thereby learning a new representation of the input data that is domain invariant and discriminative. After training, the classifier is used to predict both source and target domain labels. We apply this method for large-scale land cover classification and cross-sensor hyperspectral classification problems. Experimental results obtained show that the proposed method provides a performance gain of up to 40%, and thus indicates the efficacy of the method.
42

A Study on Multi-Granularity Representation Learning of Time Series Data / 時系列データのマルチグラニュラリティ表現学習に関する研究

Ye, Chengyang 23 January 2024 (has links)
京都大学 / 新制・課程博士 / 博士(情報学) / 甲第25022号 / 情博第854号 / 新制||情||143(附属図書館) / 京都大学大学院情報学研究科社会情報学専攻 / (主査)教授 伊藤 孝行, 教授 神田 崇行, 教授 森 信介, 教授 馬 強(京都工芸繊維大学) / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM
43

Exploring adaptation of self-supervised representation learning to histopathology images for liver cancer detection

Jonsson, Markus January 2024 (has links)
This thesis explores adapting self-supervised representation learning to visual domains beyond natural scenes, focusing on medical imaging. The research addresses the central question: “How can self-supervised representation learning be specifically adapted for detecting liver cancer in histopathology images?” The study utilizes the PAIP 2019 dataset for liver cancer segmentation and employs a self-supervised approach based on the VICReg method. The evaluation results demonstrated that the ImageNet-pretrained model achieved superior performance on the test set, with a clipped Jaccard index of 0.7747 at a threshold of 0.65. The VICReg-pretrained model followed closely with a score of 0.7461, while the model initialized with random weights trailed behind at 0.5420. These findings indicate that while ImageNet-pretrained models outperformed VICReg-pretrained models, the latter still captured essential data characteristics, suggesting the potential of self-supervised learning in diverse visual domains. The research attempts to contribute to advancing self-supervised learning in non-natural scenes and provides insights into model pretraining strategies.
44

Representation Learning Based Causal Inference in Observational Studies

Lu, Danni 22 February 2021 (has links)
This dissertation investigates novel statistical approaches for causal effect estimation in observational settings, where controlled experimentation is infeasible and confounding is the main hurdle in estimating causal effect. As such, deconfounding constructs the main subject of this dissertation, that is (i) to restore the covariate balance between treatment groups and (ii) to attenuate spurious correlations in training data to derive valid causal conclusions that generalize. By incorporating ideas from representation learning, adversarial matching, generative causal estimation, and invariant risk modeling, this dissertation establishes a causal framework that balances the covariate distribution in latent representation space to yield individualized estimations, and further contributes novel perspectives on causal effect estimation based on invariance principles. The dissertation begins with a systematic review and examination of classical propensity score based balancing schemes for population-level causal effect estimation, presented in Chapter 2. Three causal estimands that target different foci in the population are considered: average treatment effect on the whole population (ATE), average treatment effect on the treated population (ATT), and average treatment effect on the overlap population (ATO). The procedure is demonstrated in a naturalistic driving study (NDS) to evaluate the causal effect of cellphone distraction on crash risk. While highlighting the importance of adopting causal perspectives in analyzing risk factors, discussions on the limitations in balance efficiency, robustness against high-dimensional data and complex interactions, and the need for individualization are provided to motivate subsequent developments. Chapter 3 presents a novel generative Bayesian causal estimation framework named Balancing Variational Neural Inference of Causal Effects (BV-NICE). Via appealing to the Robinson factorization and a latent Bayesian model, a novel variational bound on likelihood is derived, explicitly characterized by the causal effect and propensity score. Notably, by treating observed variables as noisy proxies of unmeasurable latent confounders, the variational posterior approximation is re-purposed as a stochastic feature encoder that fully acknowledges representation uncertainties. To resolve the imbalance in representations, BV-NICE enforces KL-regularization on the respective representation marginals using Fenchel mini-max learning, justified by a new generalization bound on the counterfactual prediction accuracy. The robustness and effectiveness of this framework are demonstrated through an extensive set of tests against competing solutions on semi-synthetic and real-world datasets. In recognition of the reliability issue when extending causal conclusions beyond training distributions, Chapter 4 argues ascertaining causal stability is the key and introduces a novel procedure called Risk Invariant Causal Estimation (RICE). By carefully re-examining the relationship between statistical invariance and causality, RICE cleverly leverages the observed data disparities to enable the identification of stable causal effects. Concretely, the causal inference objective is reformulated under the framework of invariant risk modeling (IRM), where a population-optimality penalty is enforced to filter out un-generalizable effects across heterogeneous populations. Importantly, RICE allows settings where counterfactual reasoning with unobserved confounding or biased sampling designs become feasible. The effectiveness of this new proposal is verified with respect to a variety of study designs on real and synthetic data. In summary, this dissertation presents a flexible causal inference framework that acknowledges the representation uncertainties and data heterogeneities. It enjoys three merits: improved balance to complex covariate interactions, enhanced robustness to unobservable latent confounders, and better generalizability to novel populations. / Doctor of Philosophy / Reasoning cause and effect is the innate ability of a human. While the drive to understand cause and effect is instinct, the rigorous reasoning process is usually trained through the observation of countless trials and failures. In this dissertation, we embark on a journey to explore various principles and novel statistical approaches for causal inference in observational studies. Throughout the dissertation, we focus on the causal effect estimation which answers questions like ``what if" and ``what could have happened". The causal effect of a treatment is measured by comparing the outcomes corresponding to different treatment levels of the same unit, e.g. ``what if the unit is treated instead of not treated?". The challenge lies in the fact that i) a unit only receives one treatment at a time and therefore it is impossible to directly compare outcomes of different treatment levels; ii) comparing the outcomes across different units may involve bias due to confounding as the treatment assignment potentially follows a systematic mechanism. Therefore, deconfounding constructs the main hurdle in estimating causal effects. This dissertation presents two parallel principles of deconfounding: i) balancing, i.e., comparing difference under similar conditions; ii) contrasting, i.e., extracting invariance under heterogeneous conditions. Chapter 2 and Chapter 3 explore causal effect through balancing, with the former systematically reviews a classical propensity score weighting approach in a conventional data setting and the latter presents a novel generative Bayesian framework named Balancing Variational Neural Inference of Causal Effects(BV-NICE) for high-dimensional, complex, and noisy observational data. It incorporates the advance deep learning techniques of representation learning, adversarial learning, and variational inference. The robustness and effectiveness of the proposed framework are demonstrated through an extensive set of experiments. Chapter 4 extracts causal effect through contrasting, emphasizing that ascertaining stability is the key of causality. A novel causal effect estimating procedure called Risk Invariant Causal Estimation(RICE) is proposed that leverages the observed data disparities to enable the identification of stable causal effects. The improved generalizability of RICE is demonstrated through synthetic data with different structures, compared with state-of-art models. In summary, this dissertation presents a flexible causal inference framework that acknowledges the data uncertainties and heterogeneities. By promoting two different aspects of causal principles and integrating advance deep learning techniques, the proposed framework shows improved balance for complex covariate interactions, enhanced robustness for unobservable latent confounders, and better generalizability for novel populations.
45

Novel document representations based on labels and sequential information

Kim, Seungyeon 21 September 2015 (has links)
A wide variety of text analysis applications are based on statistical machine learning techniques. The success of those applications is critically affected by how we represent a document. Learning an efficient document representation has two major challenges: sparsity and sequentiality. The sparsity often causes high estimation error, and text's sequential nature, interdependency between words, causes even more complication. This thesis presents novel document representations to overcome the two challenges. First, I employ label characteristics to estimate a compact document representation. Because label attributes implicitly describe the geometry of dense subspace that has substantial impact, I can effectively resolve the sparsity issue while only focusing the compact subspace. Second, while modeling a document as a joint or conditional distribution between words and their sequential information, I can efficiently reflect sequential nature of text in my document representations. Lastly, the thesis is concluded with a document representation that employs both labels and sequential information in a unified formulation. The following four criteria are utilized to evaluate the goodness of representations: how close a representation is to its original data, how strongly a representation can be distinguished from each other, how easy to interpret a representation by a human, and how much computational effort is needed for a representation. While pursuing those good representation criteria, I was able to obtain document representations that are closer to the original data, stronger in discrimination, and easier to be understood than traditional document representations. Efficient computation algorithms make the proposed approaches largely scalable. This thesis examines emotion prediction, temporal emotion analysis, modeling documents with edit histories, locally coherent topic modeling, and text categorization tasks for possible applications.
46

Language Image Transformer

January 2020 (has links)
abstract: Humans perceive the environment using multiple modalities like vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the other. Learning through several modalities helps in constructing an accurate model of the environment. Most of the current vision and language models are modality-specific and, in many cases, extensively use deep-learning based attention mechanisms for learning powerful representations. This work discusses the role of attention in associating vision and language for generating shared representation. Language Image Transformer (LIT) is proposed for learning multi-modal representations of the environment. It uses a training objective based on Contrastive Predictive Coding (CPC) to maximize the Mutual Information (MI) between the visual and linguistic representations. It learns the relationship between the modalities using the proposed cross-modal attention layers. It is trained and evaluated using captioning datasets, MS COCO, and Conceptual Captions. The results and the analysis offers a perspective on the use of Mutual Information Maximisation (MIM) for generating generalizable representations across multiple modalities. / Dissertation/Thesis / Masters Thesis Computer Engineering 2020
47

Neural Representation Learning for Semi-Supervised Node Classification and Explainability

Hogun Park (9179561) 28 July 2020 (has links)
<div>Many real-world domains are relational, consisting of objects (e.g., users and pa- pers) linked to each other in various ways. Because class labels in graphs are often only available for a subset of the nodes, semi-supervised learning for graphs has been studied extensively to predict the unobserved class labels. For example, we can pre- dict political views in a partially labeled social graph dataset and get expected gross incomes of movies in an actor/movie graph with a few labels. Recently, advances in representation learning for graph data have made great strides for the semi-supervised node classification. However, most of the methods have mainly focused on learning node representations by considering simple relational properties (e.g., random walk) or aggregating nearby attributes, and it is still challenging to learn complex inter- action patterns in partially labeled graphs and provide explanations on the learned representations. </div><div><br></div><div>In this dissertation, multiple methods are proposed to alleviate both challenges for semi-supervised node classification. First, we propose a graph neural network architecture, REGNN, that leverages local inferences for unlabeled nodes. REGNN performs graph convolution to enable label propagation via high-order paths and predicts class labels for unlabeled nodes. In particular, our proposed attention layer of REGNN measures the role equivalence among nodes and effectively reduces the noise, which is generated during the aggregation of observed labels from distant neighbors at various distances. Second, we also propose a neural network archi- tecture that jointly captures both temporal and static interaction patterns, which we call Temporal-Static-Graph-Net (TSGNet). The architecture learns a latent rep- resentation of each node in order to encode complex interaction patterns. Our key insight is that leveraging both a static neighbor encoder, that learns aggregate neigh- bor patterns, and a graph neural network-based recurrent unit, that captures complex interaction patterns, improves the performance of node classification. Lastly, in spite of better performance of representation learning on node classification tasks, neural network-based representation learning models are still less interpretable than the pre- vious relational learning models due to the lack of explanation methods. To address the problem, we show that nodes with high bridgeness scores have larger impacts on node embeddings such as DeepWalk, LINE, Struc2Vec, and PTE under perturbation. However, it is computationally heavy to get bridgeness scores, and we propose a novel gradient-based explanation method, GRAPH-wGD, to find nodes with high bridgeness efficiently. In our evaluations, our proposed architectures (REGNN and TSGNet) for semi-supervised node classification consistently improve predictive performance on real-world datasets. Our GRAPH-wGD also identifies important nodes as global explanations, which significantly change both predicted probabilities on node classification tasks and k-nearest neighbors in the embedding space after perturbing the highly ranked nodes and re-learning low-dimensional node representations for DeepWalk and LINE embedding methods.</div>
48

Self-supervised Representation Learning via Image Out-painting for Medical Image Analysis

January 2020 (has links)
abstract: In recent years, Convolutional Neural Networks (CNNs) have been widely used in not only the computer vision community but also within the medical imaging community. Specifically, the use of pre-trained CNNs on large-scale datasets (e.g., ImageNet) via transfer learning for a variety of medical imaging applications, has become the de facto standard within both communities. However, to fit the current paradigm, 3D imaging tasks have to be reformulated and solved in 2D, losing rich 3D contextual information. Moreover, pre-trained models on natural images never see any biomedical images and do not have knowledge about anatomical structures present in medical images. To overcome the above limitations, this thesis proposes an image out-painting self-supervised proxy task to develop pre-trained models directly from medical images without utilizing systematic annotations. The idea is to randomly mask an image and train the model to predict the missing region. It is demonstrated that by predicting missing anatomical structures when seeing only parts of the image, the model will learn generic representation yielding better performance on various medical imaging applications via transfer learning. The extensive experiments demonstrate that the proposed proxy task outperforms training from scratch in six out of seven medical imaging applications covering 2D and 3D classification and segmentation. Moreover, image out-painting proxy task offers competitive performance to state-of-the-art models pre-trained on ImageNet and other self-supervised baselines such as in-painting. Owing to its outstanding performance, out-painting is utilized as one of the self-supervised proxy tasks to provide generic 3D pre-trained models for medical image analysis. / Dissertation/Thesis / Masters Thesis Computer Science 2020
49

DECEPTIVE REVIEW IDENTIFICATION VIA REVIEWER NETWORK REPRESENTATION LEARNING

Shih-Feng Yang (11502553) 19 December 2021 (has links)
<div><div>With the growth of the popularity of e-commerce and mobile apps during the past decade, people rely on online reviews more than ever before for purchasing products, booking hotels, and choosing all kinds of services. Users share their opinions by posting product reviews on merchant sites or online review websites (e.g., Yelp, Amazon, TripAdvisor). Although online reviews are valuable information for people who are interested in products and services, many reviews are manipulated by spammers to provide untruthful information for business competition. Since deceptive reviews can damage the reputation of brands and mislead customers’ buying behaviors, the identification of fake reviews has become an important topic for online merchants. Among the computational approaches proposed for fake review identification, network-based fake review analysis jointly considers the information from review text, reviewer behaviors, and production information. Researchers have proposed network-based methods (e.g., metapath) on heterogeneous networks, which have shown promising results.</div><div><br></div><div>However, we’ve identified two research gaps in this study: 1) We argue the previous network-based reviewer representations are not sufficient to preserve the relationship of reviewers in networks. To be specific, previous studies only considered first-order proximity, which indicates the observable connection between reviewers, but not second-order proximity, which captures the neighborhood structures where two vertices overlap. Moreover, although previous network-based fake review studies (e.g., metapath) connect reviewers through feature nodes across heterogeneous networks, they ignored the multi-view nature of reviewers. A view is derived from a single type of proximity or relationship between the nodes, which can be characterized by a set of edges. In other words, the reviewers could form different networks with regard to different relationships. 2) The text embeddings of reviews in previous network-based fake review studies were not considered with reviewer embeddings.</div><div><br></div><div>To tackle the first gap, we generated reviewer embeddings via MVE (Qu et al., 2017), a framework for multi-view network representation learning, and conducted spammer classification experiments to examine the effectiveness of the learned embeddings for distinguishing spammers and non-spammers. In addition, we performed unsupervised hierarchical clustering to observe the clusters of the reviewer embeddings. Our results show the clusters generated based on reviewer embeddings capture the difference between spammers and non-spammers better than those generated based on reviewers’ features.</div><div><br></div><div>To fill the second gap, we proposed hybrid embeddings that combine review text embeddings with reviewer embeddings (i.e., the vector that represents a reviewer’s characteristics, such as writing or behavioral patterns). We conducted fake review classification experiments to compare the performance between using hybrid embeddings (i.e., text+reviewer) as features and using text-only embeddings as features. Our results suggest that hybrid embedding is more effective than text-only embedding for fake review identification. Moreover, we compared the prediction performance of the hybrid embeddings with baselines and showed our approach outperformed others on fake review identification experiments.</div><div><br></div><div>The contributions of this study are four-fold: 1) We adopted a multi-view representation learning approach for reviewer embedding learning and analyze the efficacy of the embeddings used for spammer classification and fake review classification. 2) We proposed a hybrid embedding that considers the characteristics of both review text and the reviewer. Our results are promising and suggest hybrid embedding is very effective for fake review identification. 3) We proposed a heuristic network construction approach that builds a user network based on user features. 4) We evaluated how different spammer thresholds impact the performance of fake review classification. Several studies have used the same datasets as we used in this study, but most of them followed the spammer definition mentioned by Jindal and Liu (2008). We argued that the spammer definition should be configurable based on different datasets. Our findings showed that by carefully choosing the spammer thresholds for the target datasets, hybrid embeddings have higher efficacy for fake review classification.</div></div>
50

Detekce střihů a vyhledávání známých scén ve videu s pomocí metod hlubokého učení / Detekce střihů a vyhledávání známých scén ve videu s pomocí metod hlubokého učení

Souček, Tomáš January 2020 (has links)
Video retrieval represents a challenging problem with many caveats and sub-problems. This thesis focuses on two of these sub-problems, namely shot transition detection and text-based search. In the case of shot detection, many solutions have been proposed over the last decades. Recently, deep learning-based approaches improved the accuracy of shot transition detection using 3D convolutional architectures and artificially created training data, but one hundred percent accuracy is still an unreachable ideal. In this thesis we present a deep network for shot transition detection TransNet V2 that reaches state-of- the-art performance on respected benchmarks. In the second case of text-based search, deep learning models projecting textual query and video frames into a joint space proved to be effective for text-based video retrieval. We investigate these query representation learning models in a setting of known-item search and propose improvements for the text encoding part of the model. 1

Page generated in 0.1631 seconds