11

SCALABLE REPRESENTATION LEARNING WITH INVARIANCES

Changping Meng (8802956) 07 May 2020 (has links)
In many complex domains, the input data are often not suited for the typical vector representations used in deep learning models. For example, in knowledge representation, relational learning, and some computer vision tasks, the data are often better represented as graphs or sets. In these cases, a key challenge is to learn a representation function that is invariant to permutations of a set or isomorphisms of graphs.

To handle graph isomorphism, this thesis proposes a subgraph pattern neural network with invariance to graph isomorphisms and varying local neighborhood sizes. Our key insight is to incorporate the unavoidable dependencies among the training observations of induced subgraphs into both the input features and the model architecture itself via high-order dependencies, while still taking node/edge labels into account and facilitating inductive reasoning.

To learn permutation-invariant set functions, this thesis shows how the characteristics of an architecture's computational graph impact its ability to learn in contexts with complex set dependencies, and demonstrates limitations of current methods with respect to one or more of these complexity dimensions. I also propose a new Self-Attention GRU architecture, whose computation graph is built automatically via self-attention to minimize average interaction path lengths between set elements, in order to effectively capture complex dependencies between set elements.

Beyond the typical set problem, a new problem of representing sets-of-sets (SoS) is proposed, in which multi-level dependence and multi-level permutation invariance must be handled jointly. To address this, I propose a hierarchical sequence attention framework (HATS) for inductive set-of-sets embeddings, and develop the stochastic optimization and inference methods required for efficient learning.
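As a minimal illustration of the permutation-invariance requirement discussed above, the sketch below shows a DeepSets-style set encoder whose output is unchanged under any reordering of its input. This is a generic construction for intuition only, not the Self-Attention GRU or subgraph pattern network proposed in the thesis; all layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """Permutation-invariant set encoder in the DeepSets style:
    encode each element independently, then sum-pool, so the output
    is unchanged under any reordering of the input set."""
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, in_dim); sum over the set dimension
        return self.rho(self.phi(x).sum(dim=1))

enc = SetEncoder(8, 64, 16)
s = torch.randn(2, 5, 8)
perm = s[:, torch.randperm(5), :]   # reorder the set elements
assert torch.allclose(enc(s), enc(perm), atol=1e-5)
```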
12

Towards Quality and General Knowledge Representation Learning

Tang, Zhenwei 03 1900 (has links)
Knowledge representation learning (KRL) has been a long-standing and challenging topic in artificial intelligence. Recent years have witnessed rapidly growing research interest in and industrial applications of KRL. However, two important aspects of KRL remain unsatisfactory in academia and industry: the quality and the generalization capabilities of the learned representations. This thesis presents a set of methods that target learning high-quality distributed knowledge representations and further empowering the learned representations for more general reasoning tasks over knowledge bases. On the one hand, we identify the false-negative issue and the data-sparsity issue in the knowledge graph completion (KGC) task, both of which can limit the quality of the learned representations. Correspondingly, we design a ranking-based positive-unlabeled learning method along with an adversarial data augmentation strategy for KGC, and unify them seamlessly to improve the quality of the learned representations. On the other hand, although recent works expand the supported neural reasoning tasks remarkably by answering multi-hop logical queries, the generalization capabilities are still limited to inductive reasoning tasks that can only provide entity-level answers. In fact, abductive reasoning that provides concept-level answers to queries is also in great demand by online users and a wide range of downstream tasks. Therefore, we design a joint abductive and inductive knowledge representation learning and reasoning system by incorporating, representing, and operating on concepts. Extensive experimental results along with case studies demonstrate the effectiveness of our methods in improving the quality and generalization capabilities of the learned distributed knowledge representations.
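For background on the kind of triple scoring such KGC methods build on, the sketch below shows a minimal TransE-style scorer with a margin ranking loss over one corrupted negative. It is a generic textbook construction, not the thesis's positive-unlabeled method; the entity counts, IDs, and margin are arbitrary.

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    """Minimal TransE scorer: a triple (h, r, t) is plausible when
    the embedding of h plus the embedding of r lies close to t."""
    def __init__(self, n_entities: int, n_relations: int, dim: int = 50):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

    def score(self, h, r, t):
        # Higher score = more plausible (negative L1 distance).
        return -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=1, dim=-1)

model = TransE(n_entities=1000, n_relations=20)
h, r, t = torch.tensor([3]), torch.tensor([5]), torch.tensor([7])
t_neg = torch.tensor([42])  # corrupted tail used as a (possibly false) negative
# Margin ranking loss: push true triples to score above corrupted ones.
loss = torch.relu(1.0 + model.score(h, r, t_neg) - model.score(h, r, t)).mean()
loss.backward()
```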
13

Latent Feature Models for Uncovering Human Mobility Patterns from Anonymized User Location Traces with Metadata

Alharbi, Basma Mohammed 10 April 2017 (has links)
In the mobile era, data capturing individuals' locations have become unprecedentedly available. Data from location-based social networks are one example of large-scale user-location data. Such data provide a valuable source for understanding the patterns governing human mobility, and thus enable a wide range of research. However, mining and utilizing raw user-location data is a challenging task. This is mainly due to the sparsity of the data (at the user level), the imbalance of the data, with power-law degree distributions of user and location check-ins (at the global level), and, more importantly, the lack of a uniform low-dimensional feature space describing users. Three latent feature models are proposed in this dissertation. Each proposed model takes as input a collection of user-location check-ins, and outputs new representation spaces for users and locations respectively. To avoid invading users' privacy, the proposed models are designed to learn from anonymized location data, where only the IDs of locations - not their geophysical positions or categories - are utilized. To enrich the inferred mobility patterns, the proposed models incorporate metadata, often associated with user-location data, into the inference process. In this dissertation, two types of metadata are utilized to enrich the inferred patterns: timestamps and social ties. Time adds context to the inferred patterns, while social ties amplify incomplete user-location check-ins. The first proposed model incorporates timestamps by learning from collections of users' locations sharing the same discretized time. The second proposed model also incorporates time into the learning model, yet takes a further step by considering time at different scales (hour of a day, day of a week, month, and so on). This change in modeling time allows for capturing meaningful patterns over different time scales. The last proposed model incorporates social ties into the learning process to compensate for inactive users who contribute a large volume of incomplete user-location check-ins. To assess the quality of the new representation spaces for each model, evaluation is done using an external application, social link prediction, in addition to case studies and analysis of inferred patterns. Each proposed model is compared to baseline models, where results show significant improvements.
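As a toy stand-in for the latent feature models described above, the sketch below factorizes an anonymized user-location check-in matrix into low-dimensional user and location features. The dissertation's models are probabilistic and incorporate metadata such as timestamps and social ties, which this deliberately simple NMF example omits; the check-in data here are synthetic.

```python
import numpy as np
from sklearn.decomposition import NMF

# Rows are anonymized users, columns are location IDs,
# entries are check-in counts (synthetic data for illustration).
rng = np.random.default_rng(0)
checkins = rng.poisson(0.3, size=(100, 50)).astype(float)

# Non-negative factorization: checkins ~= user_features @ location_features.T
nmf = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
user_features = nmf.fit_transform(checkins)   # (100, 10) user representations
location_features = nmf.components_.T         # (50, 10) location representations
```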
14

Learning 3D Shape Representations for Reconstruction and Modeling

Biao, Zhang 04 1900 (has links)
Neural fields, also known as neural implicit representations, are powerful for modeling 3D shapes. They encode shapes as continuous functions mapping 3D coordinates to scalar values such as the signed distance function (SDF) or occupancy probability. Neural fields represent complex shapes using an MLP: the MLP takes spatial coordinates as input, applies nonlinear transformations, and approximates the continuous function of the neural field. During training, the MLP's weights are learned through backpropagation. This PhD thesis presents novel methods for shape representation learning and generation with neural fields. The first part introduces an interpretable, high-quality reconstruction method for neural fields, in which a neural network predicts labeled points, improving surface visualization and interpretability. The method achieves accurate reconstruction even from rendered image input, and a binary classifier based on the predicted labeled points represents the shape's surface with precision. The second part focuses on shape generation, a challenge in generative modeling. Complex data structures like octrees or BSP-trees are difficult to generate with neural networks. To address this, a two-step framework is proposed: an autoencoder compresses the neural field into a fixed-size latent space, and generative models are then trained within that space. Incorporating sparsity into the shape autoencoding network reduces dimensionality while maintaining high-quality shape reconstruction, and autoregressive transformer models enable the generation of complex shapes with intricate details. This research also explores the potential of denoising diffusion models for 3D shape generation: the latent space is compressed further, leading to more efficient and effective generation of high-quality shapes, with remarkable reconstruction results achieved even without sparse structures. The approach combines the latest advances in generative modeling with novel techniques and has the potential to revolutionize shape generation in gaming, manufacturing, and beyond. In summary, this PhD thesis proposes novel methods for shape representation learning, generation, and reconstruction. It contributes to the field of shape analysis and generation by enhancing interpretability, improving reconstruction quality, and pushing the boundaries of efficient and effective 3D shape generation.
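The coordinate-to-scalar mapping described above can be sketched as a plain MLP. The block below is a generic neural field for intuition only, not the thesis's architecture; all layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class NeuralField(nn.Module):
    """Generic neural field: maps a 3D coordinate to a scalar such as
    a signed distance or an occupancy logit."""
    def __init__(self, hidden: int = 256, layers: int = 4):
        super().__init__()
        dims = [3] + [hidden] * layers + [1]
        mods = []
        for i in range(len(dims) - 1):
            mods.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                mods.append(nn.ReLU())
        self.mlp = nn.Sequential(*mods)

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        return self.mlp(xyz)   # (N, 3) -> (N, 1) scalar field value

field = NeuralField()
pts = torch.rand(1024, 3) * 2 - 1   # query points in [-1, 1]^3
sdf = field(pts)                    # predicted field values
# The surface is the zero level set {x : field(x) = 0}; occupancy
# variants instead threshold a sigmoid of the output at 0.5.
```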
15

Perceptual facial expression representation

Mikheeva, Olga January 2017 (has links)
Facial expressions play an important role in areas such as human communication and medical state evaluation. For machine learning tasks in those areas, it would be beneficial to have a representation of facial expressions that corresponds to human similarity perception. In this work, a data-driven approach to representation learning of facial expressions is taken. The methodology is built upon Variational Autoencoders and eliminates appearance-related features from the latent space by using neutral facial expressions as additional inputs. To improve the quality of the learned representation, we modify the prior distribution of the latent variable to impose a structure on the latent space that is consistent with human perception of facial expressions. We conduct experiments on two datasets and additionally collected similarity data, show that the human-like topology in the latent representation helps to improve performance on a stereotypical emotion classification task, and demonstrate the benefits of using a probabilistic generative model in exploring the roles of the latent dimensions through the generative process.
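A hedged sketch of the conditioning idea: an encoder that consumes both the expressive face and the subject's neutral face, so appearance can be factored out of the latent code. The layer sizes and the flattened-image input are illustrative assumptions, not the paper's actual network.

```python
import torch
import torch.nn as nn

class ExpressionEncoder(nn.Module):
    """Encodes an (expressive, neutral) image pair into a latent
    distribution intended to capture expression, not appearance."""
    def __init__(self, img_dim: int = 64 * 64, z_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * img_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, z_dim)
        self.logvar = nn.Linear(512, z_dim)

    def forward(self, expr, neutral):
        # Conditioning on the neutral face lets the latent code focus
        # on what differs between the two images: the expression.
        h = self.net(torch.cat([expr, neutral], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return z, mu, logvar

enc = ExpressionEncoder()
z, mu, logvar = enc(torch.rand(4, 64 * 64), torch.rand(4, 64 * 64))
```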
16

The Viability of Cluster Based Representations for Classification of Over the Counter Derivative Populations

Nordberg, Marcus January 2017 (has links)
A population of financial derivatives can be compressed if a subset of derivatives yields a net cash flow between the parties involved that lies within a given tolerance level. To conduct a correct population compression, it is essential that all derivatives of the involved parties are present in the derivative set. The current state of the art for ensuring this is to have analysts with domain expertise analyze the populations with the help of assisting tools. The purpose of this project was to automate this process through machine learning classification. Different ways of using clustering to represent a collection of derivatives were implemented and evaluated. The first representation derives from a clustering of all derivatives across populations, describing the distribution of the derivatives across the clusters. A second representation uses the same clustering but instead forms a vector of distances from a population to each of the clusters. These representations were compared to two naive representations: one where the mean derivative of a population is used as the representation, and one where a random clustering is used to find a distribution. The representations were evaluated through classification, using three different classification models (a support vector machine, a decision tree, and a Naive Bayes classifier). Different models were tested to examine whether the representations generalize across models. Both proposed representations were found to be merely comparable with the naive ones, indicating that they fail to capture the characteristics of missing derivatives. The cause appears to be that populations of derivatives vary too much for clustering to be consistent enough across populations.
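The first representation described above maps directly to code: cluster all derivatives across populations, then describe each population by its normalized histogram over the clusters. The sketch below uses synthetic feature vectors, since the real derivative features are not given here.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Each row is one derivative's feature vector; populations are row subsets.
all_derivatives = rng.normal(size=(500, 12))
populations = [rng.choice(500, size=40, replace=False) for _ in range(5)]

# Cluster all derivatives across populations.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(all_derivatives)

def population_histogram(idx: np.ndarray) -> np.ndarray:
    """Represent a population by its normalized cluster distribution."""
    labels = kmeans.predict(all_derivatives[idx])
    counts = np.bincount(labels, minlength=kmeans.n_clusters)
    return counts / counts.sum()

# One fixed-size vector per population, usable by any classifier.
reps = np.stack([population_histogram(p) for p in populations])  # (5, 8)
```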
17

Towards Explainable Event Detection and Extraction

Mehta, Sneha 22 July 2021 (has links)
Event extraction refers to extracting specific knowledge of incidents from natural language text and consolidating it into a structured form. Important applications of event extraction include search, retrieval, question answering, and event forecasting. However, before events can be extracted, it is imperative to detect them, i.e., to identify which documents from a large collection contain events of interest and to extract from those documents the sentences that might contain event-related information. This task is challenging because it is easier to obtain labels at the document level than fine-grained annotations at the sentence level. Current approaches for this task are suboptimal because they directly aggregate sentence probabilities estimated by a classifier to obtain document probabilities, resulting in error propagation. To alleviate this problem, we propose to leverage recent advances in representation learning by using attention mechanisms. Specifically, for event detection we propose a method to compute document embeddings from sentence embeddings by leveraging attention, and to train a document classifier on those embeddings to mitigate the error-propagation problem. However, we find that existing attention mechanisms are ill-suited for this task, because they are either suboptimal or use a large number of parameters. To address this problem, we propose a lean attention mechanism that is effective for event detection. Current approaches for event extraction rely on fine-grained labels in specific domains. Extending extraction to new domains is challenging because of the difficulty of collecting fine-grained data. Machine reading comprehension (MRC) based approaches, which enable zero-shot extraction, struggle with syntactically complex sentences and long-range dependencies. To mitigate this problem, we propose a syntactic sentence simplification approach that is guided by the MRC model to improve its performance on event extraction. / Doctor of Philosophy / Event extraction is the task of extracting events of societal importance from natural language texts. The task has a wide range of applications, from search, retrieval, and question answering to forecasting population-level events like civil unrest or disease occurrences with reasonable accuracy. Before events can be extracted, it is imperative to identify the documents that are likely to contain the events of interest and to extract the sentences that mention those events; this is termed event detection. Current approaches for event detection are suboptimal. They assume that events are neatly partitioned into sentences and obtain document-level event probabilities directly from predicted sentence-level probabilities. In this dissertation, under the same assumption, we mitigate some of the shortcomings of previous event detection methods by leveraging representation learning. Current approaches to event extraction are limited to restricted domains and require fine-grained labeled corpora for training. One way to extend event extraction to new domains is by enabling zero-shot extraction. Machine reading comprehension (MRC) based approaches provide a promising way forward for zero-shot extraction. However, this approach suffers from the long-range dependency problem and has difficulty handling syntactically complex sentences with multiple clauses. To mitigate this problem, we propose a syntactic sentence simplification algorithm that is guided by the MRC system to improve its performance.
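A minimal sketch of the attention idea described above: a document embedding computed as a softmax-weighted sum of its sentence embeddings, on which a document classifier can then be trained. This is generic softmax attention, not the dissertation's lean attention mechanism, and the dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Pools a variable number of sentence embeddings into one
    document embedding via learned softmax attention weights."""
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, sents: torch.Tensor) -> torch.Tensor:
        # sents: (n_sentences, dim)
        weights = torch.softmax(self.scorer(sents), dim=0)   # (n, 1)
        return (weights * sents).sum(dim=0)                  # (dim,)

pool = AttentionPool(dim=128)
doc_emb = pool(torch.randn(17, 128))   # 17 sentence embeddings -> 1 document
# Training a document classifier on doc_emb avoids aggregating per-sentence
# probabilities directly, which is the error-propagation issue noted above.
```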
18

Evaluating Multi-Agent Modeller Representations

Demke, Jonathan 15 November 2022 (has links)
The way a multi-agent modeller represents an agent affects not only its ability to reason about agents but also the interpretability of its representation space and its efficacy on future downstream tasks. We utilize and repurpose metrics from the field of representation learning to specifically analyze and compare multi-agent modellers that build real-valued vector representations of the agents they model. By generating two datasets and analyzing the representations of multiple LSTM- or transformer-based modellers with various embedding sizes, we demonstrate that representation metrics provide a more complete and nuanced picture of a modeller's representation space than an analysis based only on performance. We also provide insights regarding LSTM- and transformer-based representations. Our proposed metrics are general enough to work on a wide variety of modellers and datasets.
19

Static Branch Prediction through Representation Learning

Alovisi, Pietro January 2020 (has links)
In the context of compilers, branch probability prediction deals with estimating the probability that a branch in a program will be taken. In the absence of profiling information, compilers rely on statically estimated branch probabilities, and state-of-the-art branch probability predictors are based on heuristics. Recent machine learning approaches learn directly from source code using natural language processing algorithms. A representation learning word embedding algorithm is built and evaluated to predict branch probabilities on LLVM's intermediate representation (IR) language. The predictor is trained and tested on SPEC's CPU 2006 benchmark and compared to state-of-the-art branch probability heuristics. The predictor obtains a better miss rate and accuracy in branch prediction than all the evaluated heuristics, but produces on average no performance speedup over LLVM's branch predictor on the benchmark. This investigation shows that it is possible to predict branch probabilities using representation learning, but more effort must be put into obtaining a predictor with practical advantages over the heuristics.
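As a rough illustration of the pipeline described (learn embeddings of IR tokens, then predict branch behaviour from them), the sketch below averages word2vec token embeddings per basic block and fits a logistic regression. The token sequences and labels are invented for illustration; the thesis's actual feature extraction from LLVM IR is more involved.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# Toy corpus of tokenized LLVM-IR-like instruction sequences (invented).
blocks = [["icmp", "slt", "i32", "br"], ["load", "i32", "icmp", "eq", "br"],
          ["add", "i32", "store", "br"], ["icmp", "sgt", "i32", "br"]]
taken = [1, 0, 1, 1]   # synthetic "branch taken" labels

# Learn token embeddings from the IR token sequences.
w2v = Word2Vec(sentences=blocks, vector_size=16, window=3, min_count=1, seed=0)

def embed(tokens):
    """Represent a basic block by the mean of its token embeddings."""
    return np.mean([w2v.wv[t] for t in tokens], axis=0)

X = np.stack([embed(b) for b in blocks])
clf = LogisticRegression().fit(X, taken)
p_taken = clf.predict_proba(X)[:, 1]   # estimated branch probabilities
```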
20

Representation learning with a temporally coherent mixed-representation

Parkinson, Jon January 2017 (has links)
Guiding a representation towards capturing temporally coherent aspects present in video improves object identity encoding. Existing models apply temporal coherence uniformly over all features, based on the assumption that optimal encoding of object identity only requires temporally stable components. We test the validity of this assumption by exploring the effects of applying a mixture of temporally coherent invariant features, alongside variable features, in a single 'mixed' representation. Applying temporal coherence to different proportions of the available features, we evaluate a range of models on a supervised object classification task. This series of experiments was tested on three video datasets, each with a different complexity of object shape and motion. We also investigated whether a mixed representation improves the capture of information components associated with object position, alongside object identity, in a single representation. Tests were initially applied using a single-layer autoencoder as a test bed, followed by subsequent tests investigating whether similar behaviour occurred in the more abstract features learned by a deep network. A representation applying temporal coherence in some fashion produced the best results in all tests, on both single-layered and deep networks. The majority of tests favoured a mixed representation, especially in cases where the quantity of labelled data available to the supervised task was plentiful. This work is the first time a mixed representation has been investigated, and demonstrates its use as a method for representation learning.
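The mixed-representation idea reduces to applying a temporal-coherence penalty to only a slice of the feature vector, leaving the rest free to vary. A hedged sketch follows; the split proportion and weight are illustrative, not the thesis's settings.

```python
import torch

def mixed_coherence_loss(z_t, z_next, n_coherent: int, weight: float = 1.0):
    """Penalize change over time in the first `n_coherent` features only,
    leaving the remaining features free to vary (e.g., to encode position)."""
    stable = z_t[:, :n_coherent] - z_next[:, :n_coherent]
    return weight * (stable ** 2).sum(dim=1).mean()

# Features of consecutive video frames from some encoder (random here).
z_t, z_next = torch.randn(32, 64), torch.randn(32, 64)
# Apply coherence to half the features; the rest form the "variable" part.
loss = mixed_coherence_loss(z_t, z_next, n_coherent=32)
```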
