Global ETD Search

211	Pavages : périodicité et complexité calculatoire / Tilings: periodicity and computational complexity Vanier, Pascal 22 November 2012 This thesis is dedicated to the study of subshifts of finite type (SFTs): sets of colorings of the discrete plane that respect local constraints given by a set of forbidden patterns. We study the links between SFTs and computability. Since SFTs are particular effectively closed classes, we first study the structure of their sets of Turing degrees, comparing it to that of effectively closed classes in general: for any effectively closed class, there exists an SFT with the same Turing degrees, except possibly 0, the degree of recursive sets. Furthermore, SFTs containing no recursive member have a particular structure: they always contain a cone of Turing degrees, i.e., a Turing degree and all degrees above it. We then study the sets of periods of SFTs for different notions of periodicity, reaching characterizations by means of computational complexity classes or computability classes for each notion studied. Finally, we look at the computational hardness of the factorization and conjugacy problems, which are the notions of simulation and equivalence adapted to SFTs. Keywords: Tilings; Subshifts; Computability; Complexity; SFT; Periodicity; Factorization; Turing degree
212	Řešení diofantických rovnic rozkladem v číselných tělesech / Solving diophantine equations by factorization in number fields Hrnčiar, Maroš January 2015 Title: Solving diophantine equations by factorization in number fields. Author: Bc. Maroš Hrnčiar. Department: Department of Algebra. Supervisor: Mgr. Vítězslav Kala, Ph.D., Mathematical Institute, University of Göttingen. Abstract: The question of solvability of diophantine equations is one of the oldest mathematical problems in the history of mankind. While different approaches have been developed for solving certain types of equations, this thesis predominantly deals with the method of factorization over algebraic number fields. The idea behind this method is to express the equation in the form L = y^n, where L equals a product of typically linear factors with coefficients in a particular number field. Provided that several assumptions are met, it follows that each of the factors must be the n-th power of an element of the field. The structure of number fields plays a key role in the application of this method, hence a crucial part of the thesis presents an overview of algebraic number theory. In addition to the general theoretical part, the thesis contains all the necessary computations in specific quadratic and cubic number fields describing their basic characteristics. However, the main objective of this thesis is solving specific examples of equations. For instance, in the case of equation x^2 + y^2 = z^3 we...
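As a small illustration of the method described in entry 212, the following worked example applies it to the equation x^2 + 1 = y^3 over the Gaussian integers Z[i]. The choice of equation and number field is an assumption made here for illustration; it is a standard textbook case and is not taken from the thesis itself.

```latex
\begin{align*}
x^2 + 1 = y^3 \;&\Longrightarrow\; (x+i)(x-i) = y^3 \quad\text{in } \mathbb{Z}[i]. \\
\intertext{If $x$ were odd, then $x^2+1 \equiv 2 \pmod 4$ could not be a cube, so $x$ is even and
$N(x+i) = x^2+1$ is odd. A common divisor of $x+i$ and $x-i$ divides their difference $2i$,
so its norm divides both $4$ and an odd number; hence the two factors are coprime.
Since $\mathbb{Z}[i]$ has unique factorization and every unit $\pm 1, \pm i$ is a cube,
each factor is itself a cube:}
x + i &= (a+bi)^3 = (a^3 - 3ab^2) + (3a^2b - b^3)\,i, \qquad a, b \in \mathbb{Z}. \\
\intertext{Comparing imaginary parts gives}
1 &= b\,(3a^2 - b^2) \;\Longrightarrow\; b = \pm 1. \\
b = 1 &: \; 3a^2 = 2 \quad\text{(no integer solution)}, \\
b = -1 &: \; 3a^2 = 0 \;\Longrightarrow\; a = 0, \quad x = a^3 - 3ab^2 = 0, \quad y = 1.
\end{align*}
```

Under these assumptions the only integer solution is (x, y) = (0, 1). The argument relies on unique factorization in Z[i]; in number fields without it, additional assumptions on the class group are needed.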
213	Fatoração de matrizes no problema de coagrupamento com sobreposição de colunas / Matrix factorization for overlapping columns coclustering Brunialti, Lucas Fernandes 31 August 2016 Coclustering is a data analysis strategy that discovers groups of data, known as coclusters, which are formed by considering different subsets of the descriptive features of the data. Application contexts characterized by subjectivity, such as text mining, are candidates for coclustering, because the flexibility to associate documents according to partial features is a suitable way to handle such subjectivity. Matrix factorization is one method for implementing coclustering that can handle this type of data. In this master's thesis, two coclustering strategies based on non-negative matrix factorization are proposed; they are able to find coclusters with column overlap in a matrix of positive real values. The strategies are presented in terms of their formal definitions and their algorithms. Quantitative and qualitative experimental results are reported on synthetic datasets and on real datasets from text mining. The results are analyzed in terms of space quantization and reconstruction capability, clustering capability (using the external metrics Rand index and normalized mutual information), and generated information (interpretability of the models). The results confirm the hypothesis that the proposed strategies discover overlapping coclusters naturally, and that these coclusters provide detailed information that is valuable for future research in cluster analysis and text mining. Keywords: Cluster analysis; Coclustering; Non-negative matrix factorization; Text mining
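For readers unfamiliar with the non-negative matrix factorization (NMF) that entry 213 builds on, the sketch below shows plain NMF with the classical multiplicative update rules for the Frobenius-norm objective. It is not the overlapping coclustering strategy proposed in the thesis; the function name `nmf`, its parameters, and the toy matrix in the usage note are assumptions made for illustration.

```python
import numpy as np


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (n x m) as W @ H.

    W is n x rank and H is rank x m, both non-negative.
    Uses multiplicative updates for the squared Frobenius error.
    This is a generic illustration, not the thesis's algorithm.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

A possible use: with `X = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)` and `W, H = nmf(X, rank=2)`, the rows of `H` can be read as column groups and the columns of `W` as row groups, which is the intuition behind using NMF for coclustering.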
214	Predições estatísticas para dados politômicos / Statistical predictions for polytomous data Requena, Guaraci de Lima 17 August 2018 This work generalizes the partition of the multivariate Bernoulli distribution into Bernoulli distributions and shows how this partition leads to a regression model and to a classifier for polytomous data. As a starting point, we want to make the link function for multinomial regression models explicit and write it in terms of distribution functions, as is done in the binomial case, in order to make it more flexible than the usual logit. To that end, we study the factorizations of the multivariate Bernoulli into Bernoullis, as well as of the multinomial into binomials, in order to make explicit how distribution functions may play a role in the link between the space of covariates and the vector of probabilities. Basu and Pereira (1982) explore these factorizations in a nonresponse problem, and Pereira and Stern (2008) generalize them to a class of factorizations. This work proposes a simplification both of multinomial regression, adding the flexibility of the binomial case, and of polytomous classification, decomposing the polytomous problem into dichotomous ones through a generalization of the class of factorizations. A computational problem arises because the number of distinct factorizations may be very large, depending on the number of categories, and so we propose two approaches that search, step by step, for a factorization that minimizes the binomial classification risks involved. The motivation for this work is presented in order to study the performance of such regression models and classifiers. We start from a medical problem, specifically in obsessive-compulsive disorder, in which we want to classify an individual in order to obtain a purer phenotype of the disorder and to model it in order to find the covariates related to that phenotype, using a real dataset. Keywords: Categorical data; Classification; Factorization; Multinomial regression; Obsessive-compulsive disorder
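The idea in entry 214 of factoring a multinomial into binomials can be illustrated with one particular factorization: the sequential one, in which category k is chosen with a conditional probability given that none of the earlier categories was chosen. The thesis considers a wider class of factorizations; the category ordering and the function names `to_conditionals` and `to_joint` below are assumptions made for illustration.

```python
def to_conditionals(p):
    """Turn category probabilities p_1..p_K into K-1 Bernoulli parameters.

    q_k = P(Y = k | Y is not one of 1..k-1), for k = 1..K-1.
    Assumes every category before the last leaves positive remaining mass.
    """
    q, remaining = [], 1.0
    for pk in p[:-1]:
        q.append(pk / remaining)
        remaining -= pk
    return q


def to_joint(q):
    """Inverse map: rebuild p_1..p_K from the Bernoulli parameters q."""
    p, remaining = [], 1.0
    for qk in q:
        p.append(remaining * qk)   # stop at this category
        remaining *= 1.0 - qk      # or continue to the next one
    p.append(remaining)            # last category takes what is left
    return p
```

For example, `to_conditionals([0.2, 0.3, 0.5])` gives values close to `[0.2, 0.375]`: a three-category problem becomes two dichotomous ones, each of which could be modeled with its own binomial link function.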
215	Simplificação e análise de redes com dados multivariados / Simplification and analysis of network with multivariate data Dias, Markus Diego Sampaio da Silva 17 October 2018 Visualization techniques play an important role in assisting the understanding of networks and their elements. However, when faced with massive networks, analysis tends to be hindered by visual clutter. Simplification and clustering schemes have been among the main alternatives in this context. Nevertheless, most simplification techniques consider only information extracted from the network topology, disregarding additional content defined on the nodes or edges. In this work, we propose two studies. The first is a new methodology for network simplification that uses both the topology and the content associated with network elements. The proposed methodology is based on non-negative matrix factorization (NMF) and graph matching, which are combined to generate a hierarchical representation of the network, grouping the most similar elements at each level of the hierarchy. The second is a study of the use of graph signal processing theory to filter the data associated with the elements of a network, and of its effect on the simplification process. Keywords: Clustering; Graph matching; Network; Non-negative matrix factorization; Visualization
216	Predições estatísticas para dados politômicos / Statistical predictions for polytomous data Guaraci de Lima Requena 17 August 2018 This work generalizes the partition of the multivariate Bernoulli distribution into Bernoulli distributions and shows how this partition leads to a regression model and to a classifier for polytomous data. As a starting point, we want to make the link function for multinomial regression models explicit and write it in terms of distribution functions, as is done in the binomial case, in order to make it more flexible than the usual logit. To that end, we study the factorizations of the multivariate Bernoulli into Bernoullis, as well as of the multinomial into binomials, in order to make explicit how distribution functions may play a role in the link between the space of covariates and the vector of probabilities. Basu and Pereira (1982) explore these factorizations in a nonresponse problem, and Pereira and Stern (2008) generalize them to a class of factorizations. This work proposes a simplification both of multinomial regression, adding the flexibility of the binomial case, and of polytomous classification, decomposing the polytomous problem into dichotomous ones through a generalization of the class of factorizations. A computational problem arises because the number of distinct factorizations may be very large, depending on the number of categories, and so we propose two approaches that search, step by step, for a factorization that minimizes the binomial classification risks involved. The motivation for this work is presented in order to study the performance of such regression models and classifiers. We start from a medical problem, specifically in obsessive-compulsive disorder, in which we want to classify an individual in order to obtain a purer phenotype of the disorder and to model it in order to find the covariates related to that phenotype, using a real dataset. Keywords: Categorical data; Classification; Factorization; Multinomial regression; Obsessive-compulsive disorder
217	Fatoração de matrizes no problema de coagrupamento com sobreposição de colunas / Matrix factorization for overlapping columns coclustering Lucas Fernandes Brunialti 31 August 2016 Coclustering is a data analysis strategy that discovers groups of data, known as coclusters, which are formed by considering different subsets of the descriptive features of the data. Application contexts characterized by subjectivity, such as text mining, are candidates for coclustering, because the flexibility to associate documents according to partial features is a suitable way to handle such subjectivity. Matrix factorization is one method for implementing coclustering that can handle this type of data. In this master's thesis, two coclustering strategies based on non-negative matrix factorization are proposed; they are able to find coclusters with column overlap in a matrix of positive real values. The strategies are presented in terms of their formal definitions and their algorithms. Quantitative and qualitative experimental results are reported on synthetic datasets and on real datasets from text mining. The results are analyzed in terms of space quantization and reconstruction capability, clustering capability (using the external metrics Rand index and normalized mutual information), and generated information (interpretability of the models). The results confirm the hypothesis that the proposed strategies discover overlapping coclusters naturally, and that these coclusters provide detailed information that is valuable for future research in cluster analysis and text mining. Keywords: Cluster analysis; Coclustering; Non-negative matrix factorization; Text mining
218	Data Poisoning Attacks on Linked Data with Graph Regularization January 2019 Social media has become the norm for communication, and its usage has grown rapidly over the last decade. The myriad social media services, such as Facebook, Twitter, Snapchat, and Instagram, allow people to connect freely with their friends and followers. The number of attackers trying to take advantage of this situation has also grown rapidly. Every social media service has its own recommender systems and user profiling algorithms, which use users' current information to make recommendations. The data produced by social media services is often linked data, since each item or user is usually linked with other users or items. Because of their ubiquity and prominence, recommender systems are prone to several forms of attack. One major form is poisoning the training data: since recommender systems use current user and item information as the training set, the attacker tries to modify the training set so that the recommender system benefits the attacker or gives incorrect recommendations, and thus fails in its basic function. Most existing training-set attack algorithms work with "flat" attribute-value data, which is typically assumed to be independent and identically distributed (i.i.d.). However, the i.i.d. assumption does not hold for social media data, since it is inherently linked, as described above. Using user similarity with a graph regularizer when modifying the training data produces the best results for the attacker. This thesis demonstrates this with experiments on collaborative filtering with multiple datasets. Dissertation/Thesis; Masters Thesis, Computer Science, 2019. Keywords: Computer science; Information science; Collaborative filtering; Data poisoning attacks; Graph Laplacian; Graph regularization; Linked data; Matrix factorization
219	Relation Prediction over Biomedical Knowledge Bases for Drug Repositioning Bakal, Mehmet 01 January 2019 Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since not all candidate drugs can be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying other essential relations (e.g., causation, prevention) between biomedical entities is also critical to understanding biomedical processes. Hence, it is crucial to develop automated relation prediction systems that can yield plausible biomedical relations to expedite the discovery process. In this dissertation, we demonstrate three approaches to predict treatment relations between biomedical entities for the drug repositioning task using existing biomedical knowledge bases. Our approaches can be broadly labeled as link prediction or knowledge base completion in the computer science literature. Specifically, we first investigate the predictive power of graph paths connecting entities in the publicly available biomedical knowledge base SemMedDB (the entities and relations constitute a large knowledge graph as a whole). To that end, we build logistic regression models utilizing semantic graph pattern features extracted from SemMedDB to predict treatment and causative relations in the Unified Medical Language System (UMLS) Metathesaurus. Second, we study matrix and tensor factorization algorithms for predicting drug repositioning pairs in repoDB, a general-purpose gold standard database of approved and failed drug–disease indications. The idea here is to predict repoDB pairs by approximating the given input matrix/tensor structure, where the value of a cell represents the existence of a relation coming from the SemMedDB and UMLS knowledge bases. The essential goal is to predict the test pairs that have a blank cell in the input matrix/tensor based on the shared biomedical context among existing non-blank cells. Our final approach involves graph convolutional neural networks, where entities and relation types are embedded in a vector space involving neighborhood information. Basically, we minimize an objective function to guide our model to concept/relation embeddings such that distance scores for positive relation pairs are lower than those for the negative ones. Overall, our results demonstrate that recent link prediction methods applied to automatically curated, and hence imprecise, knowledge bases can nevertheless result in high-accuracy drug candidate prediction with appropriate configuration of both the methods and datasets used. Keywords: Biomedical Relation Prediction; Machine Learning; Computational Drug Repositioning; Tensor Factorization; Graph Convolutional Neural Networks; Artificial Intelligence and Robotics; Bioinformatics; Biomedical
220	Distributed System for Factorisation of Large Numbers Johansson, Angela January 2004 This thesis aims at implementing methods for the factorisation of large numbers. Since no efficient deterministic algorithm is known for finding the prime factors of a given number, the task is rather difficult. Fortunately, effective probabilistic methods have been developed since the invention of the computer, so that it is now possible to factor numbers of about 200 decimal digits. This, however, consumes a large amount of resources, and therefore virtually all new factorisations are achieved using the combined power of many computers in a distributed system. The nature of the distributed system can vary. The original goal of the thesis was to develop a client/server system that allows clients to carry out a portion of the overall computation and submit the result to the server. The factorisation methods discussed for implementation in the thesis are the quadratic sieve, the number field sieve, and the elliptic curve method. Only a variant of the quadratic sieve was actually implemented: the multiple polynomial quadratic sieve (MPQS). Keywords: factorisation; factorization; prime factor; quadratic sieve; QS; MPQS; number field sieve; elliptic curve method; Information technology

Search results