Global ETD Search

81	Machine Learning Approaches to Reveal Discrete Signals in Gene Expression Changlin Wan (12450321) 24 April 2022 Gene expression is an intricate process that determines different cell types and functions in metazoans, where most of its regulation is communicated through discrete signals, such as whether the DNA helix is open or whether an enzyme binds to its target. Understanding the regulatory signals of the selective expression process is essential to the full comprehension of biological mechanisms and complicated biological systems. In this research, we seek to reveal the discrete signals in gene expression by utilizing novel machine learning approaches. Specifically, we focus on two types of data: chromatin conformation capture (3C) and single-cell RNA sequencing (scRNA-seq). To identify potential regulators, we utilize a new hypergraph neural network to predict genome interactions, where we find that gene co-regulation may result from a shared enhancer element. To reveal the discrete expression states from scRNA-seq data, we propose a novel model called LTMG that considers biological noise and shows better goodness of fit than existing models. Next, we apply Boolean matrix factorization to find co-regulation modules from the identified expression states, revealing a general property of cancer cells across different patients. Lastly, to find more reliable modules, we analyze the bias in the data and propose BIND, the first algorithm to quantify the column- and row-wise bias in a binary matrix. Computational Biology Machine Learning Matrix Factorization Discrete data analysis Statistical Modeling
82	Automatic tag suggestions using a deep learning recommender system / Automatiska taggförslag med hjälp av ett rekommendationssystem baserat på djupinlärning Malmström, David January 2019 This study was conducted to investigate how well deep learning can be applied to the field of tag recommender systems. In the context of an image item, tag recommendations can be given based on tags already existing on the item, or on item content information. In the current literature, no works jointly model the tags and the item content information using deep learning. Two tag recommender systems were developed. The first was a highly optimized hybrid baseline model based on matrix factorization and Bayesian classification. The second was based on deep learning. The two models were trained and evaluated on a dataset of user-tagged images and videos from Flickr. A percentage of the tags were withheld, and the evaluation consisted of predicting them. The deep learning model attained the same prediction recall as the baseline model in the main evaluation scenario, in which half of the tags were withheld. However, the baseline model generalized better to the sparser scenarios, in which a larger number of tags were withheld. Furthermore, the computations of the deep learning model were much more time-consuming than those of the baseline model. These results led to the conclusion that the baseline model was more practical, but that there is much potential in using deep learning for tag recommendation. / Den här studien genomfördes i syfte att undersöka hur effektivt djupinlärning kan användas för att konstruera rekommendationssystem för taggar. När det gäller bildobjekt så kan taggar rekommenderas baserat på taggar som redan förekommer på objektet, samt på information om objektet. 
I dagens forskning finns det inte några publikationer som presenterar ett rekommendationssystem baserat på djupinlärning som bygger på att gemensamt använda taggarna och objektsinformationen. I studien har två rekommendationssystem utvecklats. Det första var en referensmodell, ett väloptimerat hybridsystem baserat på matrisfaktorisering och bayesiansk klassificering. Det andra systemet baserades på djupinlärning. De två modellerna tränades och utvärderades på en datamängd med bilder och videor taggade av användare från Flickr. En procentandel av taggarna var undanhållna, och utvärderingen gick ut på att förutsäga dem. Djupinlärningsmodellen gav förutsägelser av samma kvalitet som referensmodellen i det primära utvärderingsscenariot, där hälften av taggarna var undanhållna. Referensmodellen gav dock bättre resultat i de scenarion där alla eller nästan alla taggar var undanhållna. Dessutom så var beräkningarna mycket mer tidskrävande för djupinlärningsmodellen jämfört med referensmodellen. Dessa resultat ledde till slutsatsen att referensmodellen var mer praktisk, men att det finns mycket potential i att använda djupinlärningssystem för att rekommendera taggar. Computer and Information Sciences Data- och informationsvetenskap
83	Improving Food Recipe Suggestions with Hierarchical Classification of Food Recipes / Förbättrande rekommendationer av matrecept genom hierarkisk klassificering av matrecept Fathollahzadeh, Pedram January 2018 Making personalized recommendations has become a central part of many platforms, and continues to grow with increasing access to massive amounts of data online. Giving recommendations based on the interests of the individual, rather than recommending items that are popular, improves the user experience and can potentially attract more customers when done right. In order to make personalized recommendations, many platforms resort to machine learning algorithms. In the context of food recipes, these machine learning algorithms tend to be hybrid methods that combine collaborative filtering, content-based methods and matrix factorization. Most content-based approaches are ingredient based and can be very fruitful. However, fetching every single ingredient for recipes and processing them can be computationally expensive. Therefore, this paper investigates whether clustering recipes according to the cuisine they belong to and their main protein can also improve rating predictions compared to when only collaborative filtering and matrix factorization methods are employed. This suggested content-based approach has the structure of a hierarchical classification, where recipes are first clustered by cuisine group, then by specific cuisine, and finally by main protein. The results suggest that the content-based approach can improve the predictions slightly but not significantly, and can help reduce the sparsity of the rating matrix to some extent. However, it suffers from heavily sparse data with respect to how many rating predictions it can give. / Att ge personliga rekommendationer har blivit en central del av många plattformar och fortsätter att bli det då tillgången till stora mängder data har ökat. 
Genom att ge personliga rekommendationer baserat på användares intressen, istället för att rekommendera det som är populärt, förbättrar användarupplevelsen och kan attrahera fler kunder. För att kunna producera personliga rekommendationer så vänder sig många plattformar till maskininlärningsalgoritmer. När det kommer till matrecept, så brukar dessa maskininlärningsalgoritmer bestå av hybrida metoder som sammanfogar collaborative filtering, innehållsbaserande metoder och matrisfaktorisering. De flesta innehållsbaserande metoderna baseras på ingredienser och har visats vara effektiva. Däremot, så kan det vara kostsamt för datorer att ta hänsyn till varenda ingrediens i varje matrecept. Därför undersöker denna artikel om att klassificera recept hierarkiskt efter matkultur och huvudprotein också kan förbättra rekommendationer när bara collaborative filtering och matrisfaktorisering används. Denna innehållsbaserande metod har en struktur av hierarkisk klassificering, där recept först indelas efter matkultur, specifik matkultur och till slut vad huvudproteinet är. Resultaten visar att innehållsbaserande metoden kan förbättra receptförslagen, men inte på en statistisk signifikant nivå, och kan reducera gleshet i en matris med tillsatta betyg från olika användare med olika recept något. Däremot så påverkas den ansenligt när det är glest med tillgänglighet av data. / Eatit collaborative filtering content-based method matrix factorization recommender systems hierarchical classification food recipes Computer Sciences Datavetenskap (datalogi)
84	Investigating the performance of matrix factorization techniques applied on purchase data for recommendation purposes Holländer, John January 2015 Automated systems that produce product recommendations for users are a relatively new area within the field of machine learning. Matrix factorization techniques have been studied to a large extent on data consisting of explicit feedback such as ratings, but to a lesser extent on implicit feedback data consisting of, for example, purchases. The aim of this study is to investigate how well matrix factorization techniques perform compared to other techniques when used for producing recommendations based on purchase data. We conducted experiments on data from an online bookstore as well as an online fashion store, by running algorithms processing the data and using evaluation metrics to compare the results. We present results proving that for many types of implicit feedback data, matrix factorization techniques are inferior to various neighborhood and association rule techniques for producing product recommendations. We also present a variant of a user-based neighborhood recommender system algorithm (UserNN), which in all tests we ran outperformed both the matrix factorization algorithms and the k-nearest neighbors algorithm in both accuracy and speed. Depending on the dataset used, UserNN achieved a precision approximately 2-22 percentage points higher than those of the matrix factorization algorithms, and 2 percentage points higher than that of the k-nearest neighbors algorithm. UserNN also outperformed the other algorithms in speed, with running times 3.5-5 times lower than those of the k-nearest neighbors algorithm, and several orders of magnitude lower than those of the matrix factorization algorithms. 
Recommender systems Matrix factorization Machine learning Implicit feedback Recommender algorithms Nearest neighbor UserNN Recommendation systems UserKnn Engineering and Technology Teknik och teknologier
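As an illustration of the neighborhood family of techniques that record 84 compares against matrix factorization, the following is a minimal sketch of a generic user-based nearest-neighbor recommender on binary purchase data. It is not the UserNN variant from the thesis, whose details the abstract does not give; the function name `recommend_user_knn`, the cosine similarity measure, and the toy purchase matrix are all assumptions made for this example.

```python
import numpy as np

def recommend_user_knn(purchases, user, k=2, n=3):
    """Recommend up to n unseen items for `user` from the k most similar users."""
    purchases = np.asarray(purchases, dtype=float)
    norms = np.linalg.norm(purchases, axis=1)
    norms[norms == 0] = 1.0                      # avoid division by zero for empty users
    # Cosine similarity between the target user and every user.
    sims = (purchases @ purchases[user]) / (norms * norms[user])
    sims[user] = -np.inf                         # a user is not their own neighbor
    neighbors = np.argsort(sims)[::-1][:k]
    # Score each item by the similarity-weighted purchases of the neighbors.
    scores = sims[neighbors] @ purchases[neighbors]
    scores[purchases[user] > 0] = -np.inf        # never recommend items already bought
    ranked = np.argsort(scores)[::-1][:n]
    return [int(i) for i in ranked if scores[i] > 0]

# Hypothetical implicit-feedback data: rows are users, columns are items, 1 = purchased.
purchases = [[1, 1, 0, 0],
             [1, 1, 1, 0],
             [0, 0, 1, 1],
             [1, 0, 0, 0]]
print(recommend_user_knn(purchases, user=0))
```

For user 0 the two nearest neighbors are users 1 and 3, and the only item they bought that user 0 has not is item 2, so that is the single recommendation returned.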
85	Topic Analysis of Tweets on the European Refugee Crisis Using Non-negative Matrix Factorization Shen, Chong 01 January 2016 The ongoing European Refugee Crisis has been one of the most popular trending topics on Twitter for the past 8 months. This paper applies topic modeling to large collections of tweets to discover the hidden patterns within these social media discussions. In particular, we perform topic analysis by solving Non-negative Matrix Factorization (NMF) as an Inexact Alternating Least Squares problem. We accelerate the computation using techniques including tweet sampling and augmented NMF, compare NMF results with different ranks, and visualize the outputs through topic representation and frequency plots. We observe that supportive sentiments maintained a strong presence while negative sentiments such as safety concerns have emerged over time. Text Mining Topic Modeling Refugee Crisis Nonnegative Matrix Factorization Alternating Least Squares Linear Algebra Numerical Analysis and Computation Other Applied Mathematics Politics and Social Change Social Media Social Statistics
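The idea in record 85 of solving NMF by alternating least squares can be sketched as follows. This is a simple projected variant (solve an unconstrained least-squares problem for one factor, then clip negatives to zero), offered only as an assumption about what an inexact ALS scheme may look like; the thesis's actual algorithm, its tweet sampling, and its augmented NMF are not reproduced here. The names `nmf_als` and `top_terms`, the vocabulary, and the count matrix are hypothetical.

```python
import numpy as np

def nmf_als(X, rank, n_iter=100, seed=0):
    """Factor a nonnegative matrix X (terms x tweets) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))
    for _ in range(n_iter):
        # Fix W, solve the least-squares problem for H, then clip negatives to zero.
        H = np.clip(np.linalg.lstsq(W, X, rcond=None)[0], 0.0, None)
        # Fix H, solve for W through the transposed problem, then clip again.
        W = np.clip(np.linalg.lstsq(H.T, X.T, rcond=None)[0].T, 0.0, None)
    return W, H

def top_terms(W, vocab, n=2):
    """List the n highest-weighted vocabulary terms for each topic (column of W)."""
    return [[vocab[i] for i in np.argsort(W[:, k])[::-1][:n]]
            for k in range(W.shape[1])]

# Hypothetical toy term-by-tweet count matrix (rows follow the order of vocab).
vocab = ["border", "asylum", "welcome", "donate"]
X = np.array([[2, 3, 0, 0],
              [1, 2, 0, 1],
              [0, 0, 3, 2],
              [0, 1, 2, 3]], dtype=float)
W, H = nmf_als(X, rank=2)
print(top_terms(W, vocab))
print("relative error:", np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```

Each column of `W` is read as a topic over terms and each column of `H` as a tweet's mixture over topics; rerunning with a different `rank` is one way to compare results across ranks, as the abstract describes.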
86	Triple Non-negative Matrix Factorization Technique for Sentiment Analysis and Topic Modeling Waggoner, Alexander A 01 January 2017 Topic modeling refers to the process of algorithmically sorting documents into categories based on some common relationship between the documents. This common relationship between the documents is considered the “topic” of the documents. Sentiment analysis refers to the process of algorithmically sorting a document into a positive or negative category depending on whether the document expresses a positive or negative opinion on its respective topic. In this paper, I consider the open problem of classifying documents into a topic category as well as a sentiment category. This has a direct application to the retail industry, where companies may want to scour the web to find documents (blogs, Amazon reviews, etc.) which both speak about their product and give an opinion on it (positive, negative or neutral). My solution to this problem uses a Non-negative Matrix Factorization (NMF) technique to determine the topic classifications of a document set, and further factors the matrix to discover the sentiment behind this category of product. Machine Learning Sentiment Analysis Topic Modeling Non-negative Matrix Factorization Applied Mathematics Data Science Other Applied Mathematics Other Mathematics Theory and Algorithms
87	詞彙向量的理論與評估基於矩陣分解與神經網絡 / Theory and evaluation of word embedding based on matrix factorization and neural network 張文嘉, Jhang, Wun Jia Unknown Date 隨著機器學習在越來越多任務中有突破性的發展，特別是在自然語言處理問題上，得到越來越多的關注，近年來，詞向量是自然語言處理研究中最令人興奮的部分之一。在這篇論文中，我們討論了兩種主要的詞向量學習方法。一種是傳統的矩陣分解，如奇異值分解，另一種是基於神經網絡模型（具有負採樣的Skip-gram模型（Mikolov等人提出，2013）），它是一種迭代演算法。我們提出一種方法來挑選初始值，透過使用奇異值分解得到的詞向量當作是Skip-gram模型的初始值，結果發現替換較佳的初始值，在某些自然語言處理的任務中得到明顯的提升。 / Recently, word embedding has been one of the most exciting parts of research in natural language processing. In this thesis, we discuss the two major learning approaches for word embedding. One is traditional matrix factorization, such as singular value decomposition; the other is based on a neural network model (e.g. the Skip-gram model with negative sampling (Mikolov et al., 2013b)), which is an iterative algorithm. It is known that an iterative process is sensitive to its initial starting values. We present an approach for running the Skip-gram model with negative sampling from initial values obtained by singular value decomposition. Furthermore, we show that refined initial starting points improve performance on the analogy task and succeed in capturing fine-grained semantic and syntactic regularities using vector arithmetic. 矩陣分解 初始值 自然語言處理 神經網絡 Matrix factorization Initialization Natural language processing Neural network
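The initialization idea in record 87 can be sketched as follows: derive word vectors from a singular value decomposition, then hand them to an iterative Skip-gram trainer as its starting point. The abstract only states that SVD-derived vectors serve as initial values; the positive PMI weighting, the square-root scaling of the singular values, the function name `svd_embeddings`, and the toy co-occurrence counts below are assumptions for illustration, and the Skip-gram training itself is not shown.

```python
import numpy as np

def svd_embeddings(cooc, dim):
    """Word vectors from a truncated SVD of the positive PMI matrix of `cooc`."""
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Pointwise mutual information; zero counts give -inf or nan here.
        pmi = np.log(cooc * total / (row * col))
        # Keep only finite, positive values (positive PMI).
        ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    U, S, _ = np.linalg.svd(ppmi, full_matrices=False)
    # Split the singular values evenly between word and context factors.
    return U[:, :dim] * np.sqrt(S[:dim])

# Hypothetical symmetric word-by-word co-occurrence counts for a 4-word vocabulary.
cooc = [[0, 4, 1, 0],
        [4, 0, 0, 1],
        [1, 0, 0, 5],
        [0, 1, 5, 0]]
init_vectors = svd_embeddings(cooc, dim=2)
print(init_vectors.shape)
```

In a full pipeline, `init_vectors` would replace the random initialization of the input embedding matrix before the negative-sampling updates begin.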
88	Decomposition methods of NMR signal of complex mixtures : models and applications Toumi, Ichrak 28 October 2013 L'objectif de ce travail était de tester des méthodes de SAS pour la séparation des spectres complexes RMN de mélanges dans les plus simples des composés purs. Dans une première partie, les méthodes à savoir JADE et NNSC ont été appliquées dans le cadre de la DOSY, une application aux données CPMG était démontrée. Dans une deuxième partie, on s'est concentré sur le développement d'un algorithme efficace "beta-SNMF". Ceci s'est montré plus performant que NNSC pour beta inférieure ou égale à 2. Etant donné que dans la littérature, le choix de beta a été adapté aux hypothèses statistiques sur le bruit additif, une étude statistique du bruit RMN de la DOSY a été faite pour obtenir une image plus complète de nos données RMN étudiées. / The objective of this work was to test BSS methods for separating the complex NMR spectra of mixtures into the simpler spectra of the pure compounds. In the first part, known methods, namely JADE and NNSC, were applied in the context of DOSY, and an application to CPMG data was demonstrated. In the second part, we focused on developing an effective algorithm, "beta-SNMF". This was shown to outperform NNSC for beta less than or equal to 2. Since in the literature the choice of beta has been adapted to the statistical assumptions on the additive noise, a statistical study of NMR DOSY noise was carried out to get a more complete picture of the NMR data studied. Spectroscopie RMN DOSY Séparation aveugle des sources (SAS) Parcimonie NMR spectroscopy DOSY Blind source separation (BSS) Non negative matrix factorization (NMF) Sparsity
89	Incorporação de metadados semânticos para recomendação no cenário de partida fria / Incorporation of semantic metadata for recommendation in the cold start scenario Fressato, Eduardo Pereira 06 May 2019 Com o propósito de auxiliar os usuários no processo de tomada de decisão, diversos tipos de sistemas Web passaram a incorporar sistemas de recomendação. As abordagens mais utilizadas são a filtragem baseada em conteúdo, que recomenda itens com base nos seus atributos, a filtragem colaborativa, que recomenda itens de acordo com o comportamento de usuários similares, e os sistemas híbridos, que combinam duas ou mais técnicas. A abordagem baseada em conteúdo apresenta o problema de análise limitada de conteúdo, o qual pode ser reduzido com a utilização de informações semânticas. A filtragem colaborativa, por sua vez, apresenta o problema da partida fria, esparsidade e alta dimensionalidade dos dados. Dentre as técnicas de filtragem colaborativa, as baseadas em fatoração de matrizes são geralmente mais eficazes porque permitem descobrir as características subjacentes às interações entre usuários e itens. Embora sistemas de recomendação usufruam de diversas técnicas de recomendação, a maioria das técnicas apresenta falta de informações semânticas para representarem os itens do acervo. Estudos na área de sistemas de recomendação têm analisado a utilização de dados abertos conectados provenientes da Web dos Dados como fonte de informações semânticas. Dessa maneira, este trabalho tem como objetivo investigar como relações semânticas computadas a partir das bases de conhecimentos disponíveis na Web dos Dados podem beneficiar sistemas de recomendação. Este trabalho explora duas questões neste contexto: como a similaridade de itens pode ser calculada com base em informações semânticas e; como semelhanças entre os itens podem ser combinadas em uma técnica de fatoração de matrizes, de modo que o problema da partida fria de itens possa ser efetivamente amenizado. 
Como resultado, originou-se uma métrica de similaridade semântica que aproveita a hierarquia das bases de conhecimento e obteve um desempenho superior às outras métricas na maioria das bases de dados. E também o algoritmo Item-MSMF que utiliza informações semânticas para amenizar o problema de partida fria e obteve desempenho superior em todas as bases de dados avaliadas no cenário de partida fria. / In order to assist users in the decision-making process, several types of web systems have started to incorporate recommender systems. The most commonly used approaches are content-based filtering, which recommends items based on their attributes; collaborative filtering, which recommends items according to the behavior of similar users; and hybrid systems, which combine two or more techniques. The content-based approach suffers from the problem of limited content analysis, which can be reduced by using semantic information. Collaborative filtering, in turn, suffers from the problems of cold start, sparsity and high dimensionality of the data. Among collaborative filtering techniques, those based on matrix factorization are generally more effective because they make it possible to discover the underlying characteristics of interactions between users and items. Although recommender systems draw on several techniques, most of them lack semantic information to represent the items in the collection. Studies in this area have analyzed the use of linked open data from the Web of Data as a source of semantic information. This work therefore aims to investigate how semantic relationships computed from the knowledge bases available in the Web of Data can benefit recommender systems. This work explores two questions in this context: how the similarity of items can be calculated based on semantic information, and how similarities between items can be combined in a matrix factorization technique so that the item cold-start problem can be effectively mitigated. 
As a result, a semantic similarity metric was developed that leverages the knowledge base hierarchy and outperformed other metrics on most datasets. The work also produced the Item-MSMF algorithm, which uses semantic information to mitigate the cold-start problem and obtained superior performance on all datasets evaluated in the cold-start scenario. Cold start Collaborative filtering Dados abertos conectados Fatoração de matrizes Filtragem colaborativa Linked open data Matrix factorization Partida fria Recommender systems Sistemas de recomendação
90	Regularização social em sistemas de recomendação com filtragem colaborativa / Social Regularization in Recommender Systems with Collaborative Filtering Zabanova, Tatyana 14 May 2019 Modelos baseados em fatoração de matrizes estão entre as implementações mais bem sucedidas de Sistemas de Recomendação. Neste projeto, estudamos as possibilidades de incorporação de informações provindas de redes sociais, para melhorar a qualidade das predições do modelo tanto em modelos tradicionais de Filtragem Colaborativa, quanto em Filtragem Colaborativa Neural. / Models based on matrix factorization are among the most successful implementations of Recommender Systems. In this project, we study the possibilities of incorporating information from social networks to improve the quality of the model's predictions, both in traditional Collaborative Filtering and in Neural Collaborative Filtering. Collaborative filtering Fatoração de matrizes Filtragem colaborativa Filtragem colaborativa neural Matrix factorization Neural collaborative filtering Recommender system Regularização social Sistema de recomendação Social regularization
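One common way to read "social regularization" in matrix factorization is to add a penalty that pulls each user's latent vector toward those of the user's friends. The sketch below assumes that formulation, with a squared-error loss, L2 regularization, and stochastic gradient updates; record 90 does not state which formulation the work uses, so the function `train_social_mf`, its hyperparameters, and the toy ratings and friendship data are all hypothetical.

```python
import numpy as np

def train_social_mf(ratings, friends, n_users, n_items, rank=2,
                    lr=0.01, reg=0.05, beta=0.1, epochs=200, seed=0):
    """Matrix factorization with an extra penalty beta * ||p_u - p_f||^2 per friend f."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, rank))   # user factors
    Q = 0.1 * rng.standard_normal((n_items, rank))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            # Social term: sum of differences between p_u and each friend's factors.
            social = sum((P[u] - P[f] for f in friends.get(u, [])), np.zeros(rank))
            p_old = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u] - beta * social)
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

# Hypothetical data: (user, item, rating) triples and a symmetric friendship map.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (2, 2, 5.0), (2, 1, 1.0)]
friends = {0: [1], 1: [0]}
P, Q = train_social_mf(ratings, friends, n_users=3, n_items=3)
print("predicted rating of user 1 for item 1:", P[1] @ Q[1])
```

Setting `beta=0` recovers plain regularized matrix factorization, which makes it easy to compare predictions with and without the social term on the same data.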
