1.
Optimisation déterministe et stochastique pour des problèmes de traitement d'images en grande dimension / Deterministic and stochastic optimization for solving large size inverse problems in image processing
Vu, Thi Thanh Xuan, 13 November 2017 (has links)
In this PhD thesis, we consider the problem of the Canonical Polyadic Decomposition (CPD) of potentially large $N$-th order tensors under different constraints (non-negativity, sparsity due to a possible overestimation of the tensor rank, etc.). To tackle this problem, we propose three new iterative methods: a standard gradient-based deterministic approach, a stochastic (memetic) approach, and a proximal approach (Block-Coordinate Variable Metric Forward-Backward). The first approach extends J-P. Royer's work to the case of non-negative $N$-th order tensors. In the stochastic approach, genetic (memetic) methods are considered for the first time to solve the CPD problem; their general principle is based on the evolution of a population of candidates. In the third approach, a proximal algorithm, namely the Block-Coordinate Variable Metric Forward-Backward, is presented. The algorithm relies on two main steps: a gradient step and a proximal step. The blocks of coordinates naturally correspond to the latent factor matrices. We propose a majorant function as well as a preconditioner for each block. All methods are compared with other popular algorithms from the literature on synthetic data (both random and fluorescence-spectroscopy-like) and on real experimental data from a river-water monitoring campaign aiming at detecting the appearance of pollutants.
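To make the two-step structure concrete, here is a minimal numpy sketch of one preconditioned forward-backward block update for a nonnegative third-order CPD. The diagonal metric used below is one valid majorant-based preconditioner (via Gershgorin row sums); the majorant and preconditioner actually derived in the thesis may differ.

```python
import numpy as np

def bcvmfb_update_A(T, A, B, C, eps=1e-12):
    # One forward-backward step on the block A of a nonnegative CPD
    # T[i,j,k] ~= sum_r A[i,r] B[j,r] C[k,r]: gradient step, then proximal step.
    I, R = A.shape
    T0 = T.reshape(I, -1)                               # mode-0 unfolding (C order)
    Z = (B[:, None, :] * C[None, :, :]).reshape(-1, R)  # Khatri-Rao product of B and C
    G = (A @ Z.T - T0) @ Z                              # grad of 0.5*||T0 - A Z^T||_F^2
    # Diagonal metric majorizing Z^T Z (Gershgorin row sums), so the
    # proximal step in that metric remains a simple projection:
    d = np.abs(Z.T @ Z).sum(axis=1) + eps
    return np.maximum(A - G / d, 0.0)                   # prox of the constraint A >= 0
```

Cycling this update over the blocks A, B, and C (with the analogous unfoldings) gives the block-coordinate scheme; the gradient-based and memetic approaches of the thesis replace this inner update, not the CP model itself.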
2.
Breaking the curse of dimensionality in electronic structure methods: towards optimal utilization of the canonical polyadic decomposition
Pierce, Karl Martin, 27 January 2022 (has links)
Despite the fact that higher-order tensors (HOTs) plague electronic structure methods and severely limit the modeling of interesting chemistry problems, the introduction and application of HOT decompositions, specifically the canonical polyadic (CP) decomposition, remain fairly limited. The CP decomposition is an incredibly useful sparse tensor factorization that has the ability to disentangle all correlated modes of a tensor. However, the complexities associated with the CP decomposition have made its application in electronic structure methods difficult.
Some of the major issues related to the CP decomposition stem from the mathematics of computing the decomposition: determining the exact CP rank is an NP-hard problem, finding stationary points of rank-R approximations requires non-linear optimization techniques, and inexact CP approximations can introduce a large degree of error into tensor networks. Other issues are a result of how computer architectures are constructed. For example, central processing units (CPUs) are organized to maximize the efficiency of dense linear algebra, and thus the performance of routine tensor algebra kernels, like the Khatri-Rao product, is limited. In this work, we seek to reduce the complexities associated with the CP decomposition and create a route for others to develop reduced-scaling electronic structure theory methods using the CP decomposition.
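To illustrate why the Khatri-Rao product is the routine kernel here: each alternating-least-squares (ALS) update of a CP factor solves a linear least-squares problem whose design matrix is the Khatri-Rao product of all the other factors. A generic numpy sketch of a textbook CP-ALS step follows (not the dissertation's optimized kernels):

```python
import numpy as np

def khatri_rao(mats):
    # Column-wise Kronecker product of a list of factor matrices.
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[-1])
    return out

def cp_als_step(T, factors, mode):
    # One ALS update of factors[mode] in a rank-R CP model of the tensor T.
    others = [m for m in range(T.ndim) if m != mode]
    Tm = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)  # mode-n unfolding
    Z = khatri_rao([factors[m] for m in others])             # design matrix
    factors[mode] = Tm @ Z @ np.linalg.pinv(Z.T @ Z)         # normal equations
    return factors

# Fitting a rank-4 CP model to an exactly rank-4 order-4 tensor:
rng = np.random.default_rng(0)
true = [rng.standard_normal((s, 4)) for s in (10, 11, 12, 13)]
T = np.einsum('ir,jr,kr,lr->ijkl', *true)
est = [rng.standard_normal((s, 4)) for s in (10, 11, 12, 13)]
for _ in range(100):
    for mode in range(4):
        est = cp_als_step(T, est, mode)
```

The `Tm @ Z` contraction is the matricized-tensor-times-Khatri-Rao-product (MTTKRP) kernel whose limited performance on dense-linear-algebra-oriented CPUs the paragraph alludes to.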
In Chapter 2, we introduce the robust tensor network approximation. This approximation is a general way to eliminate the leading-order error associated with approximated tensors in a network. We utilize the robust network approximation to significantly increase the accuracy of approximating density fitting (DF) integral tensors using rank-deficient CP decompositions in the particle-particle ladder (PPL) diagram of the coupled cluster method with single and double substitutions (CCSD). We show that one can produce results with negligible error in chemically relevant energy differences using a CP rank roughly the same size as the DF fitting basis, a significantly smaller rank requirement than is found using either a nonrobust approximation or similar grid-initialized CP approximations (the pseudospectral (PS) and tensor hypercontraction (THC) approximations). Introduction of the CP approximation formally reduces the complexity of the PPL diagram from 𝓞(N⁶) to 𝓞(N⁵) and, using the robust approximation, we are able to observe a cost reduction in CCSD calculations for systems as small as a single water molecule.
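The error cancellation behind the robust approximation is easy to verify numerically: if two factors of a product each carry an error of order δ, the combination ÃB + AB̃ − ÃB̃ equals AB − δAδB, so the first-order error vanishes. A toy check with random matrices standing in for the actual integral tensors:

```python
import numpy as np

rng = np.random.default_rng(0)
n, delta = 50, 1e-3
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A_apx = A + delta * rng.standard_normal((n, n))   # approximated tensors, e.g.
B_apx = B + delta * rng.standard_normal((n, n))   # from a rank-deficient CP fit

naive  = A_apx @ B_apx                            # error is first order in delta
robust = A_apx @ B + A @ B_apx - A_apx @ B_apx    # first-order terms cancel
exact  = A @ B

print(np.linalg.norm(naive - exact))              # ~ O(delta)
print(np.linalg.norm(robust - exact))             # ~ O(delta**2)
```

In practice the exact quantities A and B appear only inside contractions that remain affordable, which is what makes the substitution useful rather than circular.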
In Chapter 3, we further demonstrate the utility of the robust network approximation and, in addition, we construct a scheme to optimize a grid-free CP decomposition of the order-four Coulomb integral tensor in 𝓞(N⁴) time. Using these ideas, we reduce the complexity of ten bottleneck contractions from 𝓞(N⁶) to 𝓞(N⁵) in the Laplace transform (LT) formulation of the perturbative triple, (T), correction to CCSD. We show that introducing CP into the LT (T) method, with a CP rank roughly the size of the DF fitting basis, reduces the cost of computing medium-sized molecules by a factor of about 2.5 and introduces negligible error into chemically relevant energy differences. Furthermore, we implement these low-cost algorithms using newly developed, optimized tensor algebra kernels in the massively-parallel, block-sparse TiledArray [Calvin, et al., Chemical Reviews 2021, 121 (3), 1203-1231] tensor framework. / Doctor of Philosophy / Electronic structure methods and accurate modeling of quantum chemistry have developed alongside advancements in computer infrastructure. Increasingly large and efficient computers have allowed researchers to model remarkably large chemical systems. Sadly, however fast computer infrastructures grow (Moore's law predicts that the number of transistors in a computer will double every 18 months), the cost of electronic structure methods grows more quickly. One of the least expensive electronic structure methods, Hartree-Fock (HF), grows quartically with molecular size; this means that doubling the size of a molecule increases the number of computer operations by a factor of 16. However, it is known that when chemical systems become sufficiently large, the amount of physical information added to the system grows linearly with system size.[Goedecker, et al., Comput. Sci. Eng., 2003, 5 (4), 14-21] Unfortunately, standard implementations of electronic structure methods will never achieve linear scaling; the disparity between the actual cost and the physical scaling of molecules is a result of storing and manipulating data using dense tensors and is known as the curse of dimensionality.[Bellman, Adaptive Control Processes, 1961]
Electronic structure theorists, in their desire to apply accurate methods to increasingly large systems, have known for some time that the cost of conventional algorithms is unreasonably high. These theorists have found that one can reveal sparsity and develop reduced-complexity algorithms using matrix decomposition techniques. However, higher-order tensors (HOTs), tensors with more than two modes, are routinely necessary in algorithm formulations. Matrix decompositions applied to HOTs are not necessarily straightforward and can have no effect on the limiting behavior of an algorithm. For example, because of the positive definiteness of the Coulomb integral tensor, it is possible to perform a Cholesky decomposition (CD) to reduce it from an order-4 tensor to a product of order-3 tensors.[Beebe, et al., Int. J. Quantum Chem., 1977, 12, 683-705] However, using the CD-approximated Coulomb integral tensors, it is not possible to reduce the complexity of popular methods such as Hartree-Fock or coupled cluster theory.
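For reference, the Beebe-style CD reduction mentioned here can be sketched with a short pivoted Cholesky routine; a random positive semidefinite matrix stands in for the (pq|rs) Coulomb matrix, and the basis-set size and tolerance are illustrative assumptions:

```python
import numpy as np

def pivoted_cholesky(V, tol=1e-10):
    # Low-rank pivoted Cholesky of a PSD matrix: V ~= L @ L.T, stopping once
    # the largest residual diagonal element falls below tol.
    n = V.shape[0]
    d = np.diag(V).astype(float).copy()    # residual diagonal
    L = np.zeros((n, 0))
    while d.max() > tol:
        p = int(np.argmax(d))
        col = (V[:, p] - L @ L[p]) / np.sqrt(d[p])
        L = np.column_stack([L, col])
        d -= col * col
    return L

# Toy PSD stand-in for the (pq|rs) Coulomb matrix of n basis functions:
n = 6
rng = np.random.default_rng(1)
X = rng.standard_normal((n * n, 3 * n))
V = X @ X.T                                # n^2 x n^2, rank 3n < n^2
L = pivoted_cholesky(V)
B = L.reshape(n, n, -1)                    # order-3 factor: (pq|rs) ~= sum_k B[p,q,k] B[r,s,k]
print(L.shape[1], np.allclose(V, L @ L.T, atol=1e-8))
```

The columns of L, reshaped to order-3, are exactly the product of order-3 tensors the paragraph describes; the point of the thesis is that this matrix-level trick alone does not change the limiting scaling of HF or coupled cluster.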
We believe that the next step in reducing the complexity of electronic structure methods is the accurate application of HOT decompositions. In this work, we only consider a single HOT decomposition: the canonical polyadic (CP) decomposition, which represents a tensor as a polyadic sum of products. The CP decomposition disentangles all modes of a tensor by representing an order-N tensor as N order-2 tensors. In this work, we construct the CP decomposition of tensors using algebraic optimization. Our goal here is to tackle one of the biggest issues associated with the CP decomposition: accurately approximating tensors and tensor networks. In Chapter 2, we develop a robust formulation to approximate tensor networks, a formulation which removes the leading-order error associated with tensor approximations in a network.[Pierce, et al., J. Chem. Theory Comput., 2021, 17 (4), 2217-2230] We apply a robust CP approximation to the coupled cluster method with single and double substitutions (CCSD) to reduce the overall cost of the approach. Using this robust CP approximation, we can compute CCSD, on average, 2.5-3 times faster while introducing negligibly small error in chemically relevant energy values. Furthermore, in Chapter 3, we again use the robust CP network approximation, in conjunction with a novel, low-cost approach to computing order-four CP decompositions, to reduce the cost of ten bottleneck computations in the perturbative triple, (T), correction to CCSD. By accelerating these computations, we are able to reduce the cost of (T) by a factor of about 2.5 while introducing only negligibly small error.
3.
Algorithmes de diagonalisation conjointe par similitude pour la décomposition canonique polyadique de tenseurs : applications en séparation de sources / Joint diagonalization by similarity algorithms for the canonical polyadic decomposition of tensors: applications in blind source separation
André, Rémi, 07 September 2018 (has links)
This thesis introduces new joint eigenvalue decomposition algorithms. These algorithms allow, amongst other things, solving the canonical polyadic decomposition problem. This decomposition is widely used for blind source separation. Using the joint eigenvalue decomposition to solve the canonical polyadic decomposition problem avoids some problems from which other canonical polyadic decomposition algorithms generally suffer, such as slow convergence, sensitivity to overfactoring, and sensitivity to correlated factors. Existing joint eigenvalue decomposition algorithms dealing with complex data either give good results only when the noise power is low, or are robust to noise but have a high numerical cost. Therefore, we first propose algorithms that deal with real and complex data in the same way. Moreover, in some applications the factor matrices of the canonical polyadic decomposition contain only nonnegative values. Taking this constraint into account makes the algorithms more robust to overfactoring and to correlated factors. Therefore, we also propose joint eigenvalue decomposition algorithms that take advantage of this nonnegativity constraint. Numerical simulations show that the first set of algorithms improves estimation accuracy and reduces the numerical cost in the case of complex data. Our numerical simulations also highlight that our nonnegative joint eigenvalue decomposition algorithms improve the estimation of the factor matrices when their columns have a high degree of correlation. Finally, we successfully applied our algorithms to two blind source separation problems: one concerning digital telecommunications and the other concerning fluorescence spectroscopy.
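The algebraic link between the CPD and diagonalization by similarity can be sketched briefly: a third-order tensor with factors A, B, C has slices T(:, :, k) = A diag(C(k, :)) Bᵀ, so a ratio of two slices is similar to a diagonal matrix, with A as the similarity transform. A noise-free, two-slice numpy illustration follows (the algorithms in the thesis diagonalize a whole set of such matrices jointly, which is what brings robustness to noise):

```python
import numpy as np

rng = np.random.default_rng(2)
R = 4
A = rng.standard_normal((R, R))
B = rng.standard_normal((R, R))
C = rng.standard_normal((6, R))
T = np.einsum('ir,jr,kr->ijk', A, B, C)     # slices: T[:,:,k] = A @ diag(C[k]) @ B.T

# Similarity matrix built from two slices: M = A @ diag(C[0]/C[1]) @ inv(A),
# so its eigenvectors recover A up to column scaling and permutation.
M = T[:, :, 0] @ np.linalg.inv(T[:, :, 1])
eigvals, A_est = np.linalg.eig(M)
print(np.sort(eigvals.real), np.sort(C[0] / C[1]))   # same ratios, reordered
```

With noisy data, a single slice pair is fragile; diagonalizing all slice combinations jointly, under one similarity transform, is the estimation problem these algorithms solve.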
4.
Autoregressive Tensor Decomposition for NYC Taxi Data Analysis
Zongwei Li (9192548), 31 July 2020 (has links)
Cities have adopted evolving urban digitization strategies, most of which increasingly focus on data, especially in the field of public transportation. Transportation data have intuitive spatial and temporal characteristics, as they are often described by when and where trips occur. Since a trip is described by many attributes, transportation data can be represented as a tensor, a container that can house data in $N$ dimensions. Unlike a traditional data frame, which only has column variables, a tensor is a more natural structure for exploring spatio-temporal data sets, making those attributes easier to interpret. However, unique techniques are required to extract useful and reliable information from attributes that are highly correlated with one another. This work presents a mixed model consisting of tensor decomposition combined with a seasonal vector autoregression in time to find latent patterns within historical taxi data, classified by taxi type and by pick-up and drop-off times of services in NYC, so that it can help predict where and when taxis will be in demand. We validated the proposed approach through experimental evaluation with real NYC taxi data. The proposed method shows the best prediction among alternative models without geographical inference, and captures the daily patterns of taxi demand for business and entertainment needs.
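A compact sketch of the pipeline described here, assuming the tensorly library for the CP step and reducing the seasonal vector autoregression to two lags (1 and the seasonal period) per temporal component; the tensor layout and parameter values are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Toy stand-in for the taxi tensor: time x pickup-zone x taxi-type (assumption).
rng = np.random.default_rng(3)
T_len, zones, types, R, s = 200, 30, 2, 5, 24   # s = daily period for hourly data
data = rng.random((T_len, zones, types))

weights, factors = parafac(tl.tensor(data), rank=R)
U, V, W = factors                               # temporal, zone, taxi-type factors

# Seasonal AR on each temporal component: u_t ~ a * u_{t-1} + b * u_{t-s}
def fit_seasonal_ar(u, s):
    X = np.column_stack([u[s - 1:-1], u[:-s]])  # lag-1 and lag-s regressors
    y = u[s:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

coefs = np.array([fit_seasonal_ar(U[:, r], s) for r in range(R)])
# One-step forecast of the temporal factor, then of the next tensor slice:
u_next = coefs[:, 0] * U[-1] + coefs[:, 1] * U[-s]
forecast = np.einsum('r,jr,kr->jk', weights * u_next, V, W)  # zones x types
```

Forecasting in the low-dimensional factor space, then expanding back through the spatial and categorical factors, is what lets the latent patterns drive the demand prediction.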
5.
Estimation de modèles tensoriels structurés et récupération de tenseurs de rang faible / Estimation of structured tensor models and recovery of low-rank tensors
Goulart, José Henrique De Morais, 15 December 2016 (has links)
In the first part of this thesis, we formulate two methods for computing a canonical polyadic decomposition having linearly structured matrix factors (such as, e.g., Toeplitz or banded factors): a general constrained alternating least squares (CALS) algorithm and an algebraic solution for the case where all factors are circulant. Exact and approximate versions of the former method are studied. The latter method relies on a multidimensional discrete-time Fourier transform of the target tensor, which leads to a system of homogeneous monomial equations whose resolution provides the desired circulant factors. Our simulations show that combining these approaches yields a statistically efficient estimator, which is also true for other combinations of CALS in scenarios involving non-circulant factors. The second part of the thesis concerns low-rank tensor recovery (LRTR) and, in particular, the tensor completion (TC) problem. We propose an efficient algorithm, called SeMPIHT, employing sequentially optimal modal projections as its hard thresholding operator. A performance bound is then derived under usual restricted isometry conditions, which however yield suboptimal sampling bounds. Yet, our simulations suggest that SeMPIHT obeys optimal sampling bounds for Gaussian measurements. Step size selection and gradual rank increase heuristics are also elaborated in order to improve performance. We also devise an imputation scheme for TC based on soft thresholding of a Tucker model core and illustrate its utility in completing real-world road traffic data acquired by an intelligent transportation system.
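A simplified sketch of the iterative-hard-thresholding scheme with sequentially truncated modal projections, standing in for SeMPIHT (the thesis additionally derives the step-size selection and gradual rank increase heuristics, omitted here; the ranks and step size below are illustrative assumptions):

```python
import numpy as np

def trunc_hosvd(X, ranks):
    # Sequentially truncated HOSVD: project each mode in turn onto its
    # leading singular subspace (the sequential modal projections).
    G = X.copy()
    factors = []
    for n, r in enumerate(ranks):
        Gn = np.moveaxis(G, n, 0).reshape(G.shape[n], -1)
        U = np.linalg.svd(Gn, full_matrices=False)[0][:, :r]
        factors.append(U)
        G = np.moveaxis(np.tensordot(U.T, G, axes=(1, n)), 0, n)
    return G, factors                       # Tucker core and factors

def reconstruct(G, factors):
    X = G
    for n, U in enumerate(factors):
        X = np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)
    return X

def iht_completion(T_obs, mask, ranks, n_iter=100, step=1.0):
    # Tensor completion by iterative hard thresholding: a gradient step on
    # the observed entries, then projection onto low multilinear rank.
    X = np.zeros_like(T_obs)
    for _ in range(n_iter):
        X = X + step * mask * (T_obs - X)   # grad of 0.5*||mask*(T_obs - X)||^2
        X = reconstruct(*trunc_hosvd(X, ranks))
    return X
```

Each truncated-HOSVD projection is optimal mode by mode, which is the "sequentially optimal" property the abstract names; replacing the hard truncation of the core by soft thresholding gives the flavor of the imputation scheme mentioned at the end.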