81 |
Cryptanalysis of Rational Multivariate Public Key Cryptosystems
Wagner, John G. 06 December 2010
No description available.
|
82 |
Improving Predictions with Reliable Extrapolation Schemes and Better Understanding of Factorization
More, Sushant N. 27 October 2016
No description available.
|
83 |
A Study of Machine Learning Approaches for Biomedical Signal Processing
Shen, Minjie 10 June 2021
The introduction of high-throughput molecular profiling technologies makes it possible to study diverse biological systems at the molecular level. However, because of various limitations of measurement instruments, data preprocessing is often required in biomedical research, and improper preprocessing has a negative impact on downstream analytics tasks. This thesis studies two important preprocessing topics: missing value imputation and between-sample normalization.
Missing data is a major issue in quantitative proteomics data analysis. While many methods have been developed for imputing missing values in high-throughput proteomics data, comparative assessment on the accuracy of existing methods remains inconclusive, mainly because the true missing mechanisms are complex and the existing evaluation methodologies are imperfect. Moreover, few studies have provided an outlook of current and future development.
We first report an assessment of eight representative methods collectively targeting three typical missing mechanisms. The selected methods are compared on both realistic simulation and real proteomics datasets, and the performance is evaluated using three quantitative measures. We then discuss fused regularization matrix factorization, a popular low-rank matrix factorization framework with similarity and/or biological regularization, which is extendable to integrating multi-omics data such as gene expressions or clinical variables. We further explore the potential application of convex analysis of mixtures, a biologically inspired latent variable modeling strategy, to missing value imputation. The preliminary results on proteomics data are provided together with an outlook into future development directions.
While a few winners emerged from our comparative assessment, data-driven evaluation of imputation methods is imperfect because performance is evaluated indirectly, on artificially missing or masked values rather than on authentic missing values. Imputation accuracy may also vary with signal intensity. Fused regularization matrix factorization makes it possible to incorporate external information. Convex analysis of mixtures presents a biologically plausible new approach.
Data normalization is essential to ensure accurate inference and comparability of gene expressions across samples or conditions. Ideally, gene expressions should be rescaled based on consistently expressed reference genes. However, for normalizing biologically diverse samples, the most commonly used reference genes have exhibited striking expression variability, and distribution-based approaches can be problematic when differentially expressed genes are significantly asymmetric.
We introduce a Cosine score based iterative normalization (Cosbin) strategy to normalize biologically diverse samples. The between-sample normalization is based on iteratively identified consistently expressed genes, where differentially expressed genes are sequentially eliminated according to scale-invariant Cosine scores.
We evaluate the performance of Cosbin and four other representative normalization methods (Total count, TMM/edgeR, DESeq2, DEGES/TCC) on both idealistic and realistic simulation data sets. Cosbin consistently outperforms the other methods across various performance criteria. Implemented in open-source R scripts and applicable to grouped or individual samples, the Cosbin tool will allow biologists to detect subtle yet important molecular signals across known or novel phenotypic groups. / Master of Science / Data preprocessing is often required due to various limitations of measurement instruments in biomedical research. This thesis studies two important preprocessing topics: missing value imputation and between-sample normalization.
Missing data is a major issue in quantitative proteomics data analysis. Imputation is the process of substituting for missing values. We propose a more realistic assessment workflow which can preserve the original data distribution, and then assess eight representative general-purpose imputation strategies. We explore two biologically inspired imputation approaches: fused regularization matrix factorization (FRMF) and convex analysis of mixtures (CAM) imputation. FRMF integrates external information such as clinical variables and multi-omics data into imputation, while CAM imputation incorporates biological assumptions. We show that the integration of biological information improves the imputation performance.
Data normalization is required to ensure correct comparison. For gene expression data, between-sample normalization is needed. We propose a Cosine score based iterative normalization (Cosbin) strategy to normalize biologically diverse samples. We show that Cosbin significantly outperforms other methods in both idealistic and realistic simulations. Implemented in open-source R scripts and applicable to grouped or individual samples, the Cosbin tool will allow biologists to detect subtle yet important molecular signals across known or novel cell types.
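The scale-invariant cosine score mentioned in this abstract can be illustrated with a minimal sketch. It assumes, as one plausible reading, that a gene is scored by the cosine between its vector of per-group expression values and the all-ones reference vector, so that a consistently expressed gene scores close to 1. The function name `cosine_to_reference` and the example values are hypothetical; this is not the Cosbin R implementation.

```python
import math

def cosine_to_reference(expression):
    """Cosine between an expression vector and the all-ones vector.

    Assumes at least one non-zero value (illustrative helper only).
    """
    dot = sum(expression)
    norm = math.sqrt(sum(v * v for v in expression))
    return dot / (norm * math.sqrt(len(expression)))

# Hypothetical per-group mean expressions for two genes.
consistent = [5.0, 5.0, 5.0]      # same level in every group
differential = [1.0, 1.0, 10.0]   # up-regulated in the third group
rescaled = [10.0 * v for v in differential]

print(cosine_to_reference(consistent))    # close to 1
print(cosine_to_reference(differential))  # clearly below 1
print(cosine_to_reference(rescaled))      # same score as `differential`
```

Because the score depends only on the direction of the vector, multiplying a gene's values by a constant leaves it unchanged, which is what makes it usable before the samples have been normalized.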
|
84 |
On Product and Sum Decompositions of Sets: The Factorization Theory of Power Monoids
Antoniou, Austin A. 10 September 2020
No description available.
|
85 |
[en] BINARY MATRIX FACTORIZATION POST-PROCESSING AND APPLICATIONS / [pt] PÓS-PROCESSAMENTO DE FATORAÇÃO BINÁRIA DE MATRIZES E APLICAÇÕES
GEORGES MIRANDA SPYRIDES 06 February 2024
Novel methods for matrix factorization introduce constraints on the decomposed matrices, allowing for unique kinds of analysis. One significant modification is the factorization of binary matrices into binary matrices. This technique can reveal common subsets and mixtures of subsets, making it useful in a variety of applications, such as market basket analysis, topic modeling, and recommendation systems. Despite these advantages, current approaches face a trade-off between accuracy, scalability, and explainability. While gradient descent-based methods are scalable, they yield high reconstruction errors when thresholded to binary matrices. Conversely, heuristic methods are not scalable. To overcome this, this thesis proposes a post-processing procedure for discretizing matrices obtained by gradient descent. This novel approach recovers the reconstruction error after thresholding and successfully processes larger matrices within a reasonable timeframe. We apply this technique to several applications, including a novel pipeline for discovering and visualizing patterns in petrochemical batch processes.
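The thresholding step that this abstract identifies as the source of reconstruction error can be sketched as follows. This is only an illustration of naive thresholding, not the post-processing procedure proposed in the thesis; the matrices `X`, `U_real`, `V_real` and all function names are hypothetical.

```python
def boolean_product(U, V):
    """Boolean product: entry (i, j) is 1 if some k has U[i][k] = V[k][j] = 1."""
    return [[int(any(U[i][k] and V[k][j] for k in range(len(V))))
             for j in range(len(V[0]))]
            for i in range(len(U))]

def threshold(M, t):
    """Discretize a real-valued matrix: entries >= t become 1, others 0."""
    return [[int(x >= t) for x in row] for row in M]

def reconstruction_error(X, U, V):
    """Number of entries where the Boolean product of U and V differs from X."""
    R = boolean_product(U, V)
    return sum(x != r for row_x, row_r in zip(X, R) for x, r in zip(row_x, row_r))

# Hypothetical binary data and real-valued factors, such as a
# gradient-descent method might return before discretization.
X = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
U_real = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.6]]
V_real = [[0.8, 0.9, 0.1], [0.1, 0.7, 0.9]]

for t in (0.5, 0.75):
    err = reconstruction_error(X, threshold(U_real, t), threshold(V_real, t))
    print(f"threshold {t}: {err} mismatched entries")
```

In this toy case the threshold 0.5 reproduces `X` exactly, while 0.75 leaves four entries wrong, which shows how sensitive the discretized factors are to the cut-off.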
|
86 |
Integer Factorization on the GPU
Podhorský, Jiří January 2014
This work deals with factorization, the decomposition of composite numbers into prime numbers, and with the possibilities for parallelizing it. It also summarizes the best-known factoring algorithms and the most popular platforms for implementing them on a graphics card. The main part of the thesis covers the design and implementation of hardware acceleration of the currently fastest algorithm on a graphics card using the OpenCL framework. The work then compares the speed of the accelerated algorithm with serial versions of the best-known factoring algorithms. Finally, it discusses the RSA key length needed for safe operation, that is, a length that cannot be broken within a realistic time frame.
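As a concrete instance of what factorization means here, the following minimal sketch decomposes an integer into primes by serial trial division. This is deliberately the simplest possible method and is not the accelerated algorithm of the thesis, which the abstract does not name; `prime_factors` is a hypothetical helper.

```python
def prime_factors(n):
    """Return the prime factors of n in non-decreasing order (trial division)."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = []
    d = 2
    while d * d <= n:
        # Divide out each prime factor d as many times as it occurs.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors.append(n)
    return factors

print(prime_factors(360))   # [2, 2, 2, 3, 3, 5]
print(prime_factors(3233))  # [53, 61], a toy RSA-style modulus
```

The running time grows with the square root of the smallest-but-one factor, which is why a product of two large primes of similar size, as in an RSA modulus, is out of reach for this approach.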
|
87 |
Memory-aware Algorithms and Scheduling Techniques for Matrix Computations / Algorithmes orientés mémoire et techniques d'ordonnancement pour le calcul matriciel
Herrmann, Julien 25 November 2015
In this thesis, we studied, from both a theoretical and a practical standpoint, the design of algorithms and scheduling techniques suited to the complex architectures of modern supercomputers. We focused in particular on memory usage and communication management in algorithms for high-performance computing (HPC). We exploited the heterogeneity of modern supercomputers to improve the performance of matrix computations. We studied the possibility of intelligently alternating LU factorization steps (faster) and QR factorization steps (numerically more stable but twice as costly) to solve a dense linear system. We improved the performance of dynamic runtime systems using static pre-computations that take into account the entire task graph of the Cholesky factorization as well as the heterogeneity of the architecture. We investigated the complexity of scheduling task graphs that use large input and output files on a heterogeneous architecture with two types of resources, each using its own memory. We designed many polynomial-time heuristics for general problems that had previously been proven NP-complete. Finally, we designed optimal algorithms for scheduling an automatic differentiation graph on a platform with two types of memory: a free but limited memory and a costly but unlimited memory. / Throughout this thesis, we have designed memory-aware algorithms and scheduling techniques suited for modern memory architectures. We have shown special interest in improving the performance of matrix computations on multiple levels. At a high level, we have introduced new numerical algorithms for solving linear systems on large distributed platforms. Most of the time, these linear solvers rely on runtime systems to handle resource allocation and data management. We also focused on improving the dynamic schedulers embedded in these runtime systems by adding static information to their decision process. We proposed new memory-aware dynamic heuristics to schedule workflows, which could be implemented in such runtime systems. Altogether, we have dealt with multiple state-of-the-art factorization algorithms used to solve linear systems, such as the LU, QR and Cholesky factorizations. We targeted different platforms ranging from multicore processors to distributed memory clusters, and worked with several reference runtime systems tailored for these architectures, such as PaRSEC and StarPU. On the theoretical side, we took special care in modelling convoluted hierarchical memory architectures. We have classified the problems that arise when dealing with these storage platforms. We have designed many efficient polynomial-time heuristics for general problems that had been shown NP-complete beforehand.
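To make concrete how a factorization is used to solve a linear system, here is a minimal serial sketch of the Cholesky route: factor a symmetric positive definite matrix as A = L Lᵀ, then solve two triangular systems. It is a textbook illustration in plain Python and says nothing about the memory-aware scheduling studied in the thesis; the helper names and the 2×2 example are hypothetical.

```python
import math

def cholesky(A):
    """Lower-triangular L with A = L L^T, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_spd(A, b):
    """Solve A x = b via Cholesky: forward solve L y = b, then back solve L^T x = y."""
    L = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[4.0, 2.0], [2.0, 3.0]]
b = [8.0, 8.0]
print(solve_spd(A, b))  # approximately [1.0, 2.0]
```

In a tiled implementation each block operation of this loop nest becomes a task, and the dependencies between those tasks form the task graph that a runtime system has to schedule.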
|
88 |
Réactions dures exclusives au twist sous-dominant / Hard exclusive processes beyond the leading twist
Besse, Adrien 02 July 2013
This thesis deals with the computation of the helicity amplitudes of exclusive diffractive rho meson leptoproduction in the perturbative Regge limit and beyond the leading twist. Understanding such exclusive processes in terms of the elementary constituents of QCD is an important challenge for understanding hadronic structure. We present two new phenomenological models based on the kT-factorization of the helicity amplitudes into a γ*(λ) → ρ(λ') impact factor, where λ and λ' denote the polarizations of the virtual photon and the rho meson, and the impact factor of the target nucleon. The γ*(λ) → ρ(λ') impact factors are computed using collinear factorization to separate out the soft part of the rho meson. The first model combines the results, up to twist 2 and twist 3 respectively, for the impact factors where both polarizations are longitudinal or transverse, with a model for the nucleon impact factor and a model for the distribution amplitudes of the rho meson. In the second approach presented in this thesis, we derive these impact factors in impact parameter space and show that the scattering amplitude of a color dipole on the nucleon factorizes, which allows our results to be combined with dipole cross-section models. We obtain very good agreement with the data from the H1 and ZEUS collaborations for virtualities higher than a few GeV. We discuss our results and compare them to other existing models.
|
89 |
Stability and stabilization of several classes of fractional systems with delays / Stabilité et stabilisation de diverses classes de systèmes fractionnaires et à retards
Nguyen, Le Ha Vy 09 December 2014
We consider two classes of linear time-invariant fractional systems with commensurate orders and discrete delays. The first one consists of multi-input single-output fractional systems with output or input delays. The second one consists of single-input single-output fractional neutral systems with commensurate delays. We study the stabilization of the first class of systems using the factorization approach. We derive left and right coprime factorizations and the associated Bézout factors, from which the set of all stabilizing controllers is constructed. For the second class of systems, we are interested in the critical case where some chains of poles are asymptotic to the imaginary axis. First, we approximate the asymptotic poles in order to determine their location relative to the axis. Then, when appropriate, necessary and sufficient conditions for H-infinity stability are derived. This stability analysis is then extended to classical delay systems of the same form, and finally a unified approach for both classes of neutral systems with commensurate delays (standard and fractional) is proposed. Next, the stabilization of a subclass of fractional neutral systems is studied. First, the set of all stabilizing controllers is derived. Second, we prove that a large class of fractional controllers with delays cannot eliminate, in the closed loop, chains of poles asymptotic to the imaginary axis if such chains are present in the controlled systems.
|
90 |
Décomposition booléenne des tableaux multi-dimensionnels de données binaires : une approche par modèle de mélange post non-linéaire / Boolean decomposition of binary multidimensional arrays using a post nonlinear mixture model
Diop, Mamadou 14 December 2018
This work is dedicated to the study of Boolean decompositions of binary multidimensional arrays using a post nonlinear mixture model. In the first part, we introduce a new approach for the Boolean factorization of binary matrices (BFBM) based on a post nonlinear mixture model. Unlike existing binary matrix factorization methods, which are based on the classical matrix product, the proposed model is equivalent to the Boolean factorization model when the entries of the factors are exactly binary; it therefore gives more interpretable results in the case of correlated binary sources, and lower-rank matrix approximations, than other state-of-the-art algorithms. A necessary and sufficient condition for the uniqueness of the BFBM is also provided. Two algorithms based on multiplicative update rules are proposed and tested in numerical simulations as well as on a real dataset. The generalization of this approach to binary multidimensional arrays (tensors) leads to the Boolean factorization of binary tensors (BFBT). The proof of the necessary and sufficient condition for the uniqueness of the Boolean decomposition of binary tensors is based on a notion of Boolean independence of a family of vectors. The multiplicative algorithm based on the post nonlinear mixture model is extended to the multidimensional case. We also propose a new, more efficient algorithm based on an AO-ADMM (Alternating Optimization-ADMM) strategy. These algorithms are compared to state-of-the-art algorithms on simulated and real data.
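The difference between the classical matrix product and the Boolean product of binary factors can be shown with a small sketch. Here an elementwise clipping function min(1, ·) stands in for a post nonlinearity; this choice is an assumption made for illustration and is not claimed to be the nonlinearity used in the thesis. All names and matrices are hypothetical.

```python
def matmul(W, H):
    """Classical matrix product over the integers."""
    return [[sum(W[i][k] * H[k][j] for k in range(len(H)))
             for j in range(len(H[0]))]
            for i in range(len(W))]

def boolean_product(W, H):
    """Boolean product: sums become logical OR, products become logical AND."""
    return [[int(any(W[i][k] and H[k][j] for k in range(len(H))))
             for j in range(len(H[0]))]
            for i in range(len(W))]

def clip_to_binary(M):
    """Illustrative post nonlinearity: clip every entry to at most 1."""
    return [[min(1, x) for x in row] for row in M]

# Two overlapping binary sources (rows of H) mixed by binary weights W.
W = [[1, 0], [1, 1], [0, 1]]
H = [[1, 1, 0], [0, 1, 1]]

ordinary = matmul(W, H)
print(ordinary)                  # the middle entry is 2: the sources overlap there
print(clip_to_binary(ordinary))  # clipping removes the double count
print(boolean_product(W, H))     # identical to the clipped product
```

With exactly binary factors, applying the nonlinearity after the classical product gives the same matrix as the Boolean product, which is the sense in which a post nonlinear model can coincide with the Boolean one.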
|