21

Problems on Non-Equilibrium Statistical Physics

Kim, Moochan, May 2010
Four problems in non-equilibrium statistical physics are investigated: (1) the thermodynamics of a single-photon gas; (2) the ground-state energy of multi-electron atoms; (3) the energy states of the H2 molecule; and (4) condensation in a weakly interacting gas of N bosons. For the single-photon heat engine, we derive an equation of state analogous to that of the classical ideal gas, use it to construct a Carnot cycle with a single photon, and show that this single-photon engine attains the Carnot efficiency. The ground-state energies of multi-electron atoms are calculated using a modified Bohr model with a shell structure for the bound electrons. The differential Schrödinger equation is reduced to the minimization of a simple energy functional, similar to the dimensional-scaling treatment of the H atom. For the carbon atom we obtain a ground-state energy of -37.82 eV with a relative error below 6%. The simplest molecular ion, H2+, is investigated by a quasi-classical method and two-center molecular orbitals. Using the two-center molecular orbitals derived from the exact treatment of the H2+ molecular-ion problem, we can reduce the number of terms in the wavefunction needed to obtain the binding energy of the H2 molecule, avoiding the conventional expansions with over a thousand terms. With the Hylleraas correlation factor 1 + kr12 we obtain a binding energy of 4.7 eV for H2, comparable to the experimental value of 4.74 eV. Condensation into the ground state of a weakly interacting Bose gas in equilibrium is investigated using a partial partition function in the canonical ensemble. The recursive relation for the partition function developed for the ideal gas is modified to apply to the interacting case, and the statistics of the occupation number of the condensate state are examined.
The well-known behavior of the Bose-Einstein condensate of a weakly interacting Bose gas is reproduced: depletion of the condensate even at zero temperature, and a maximum of the fluctuations near the transition temperature. Furthermore, the use of the canonical partition function yields a smooth crossover between low and high temperatures, enlarging the applicable range of the Bogoliubov transformation. In the course of the calculation we also develop formulas for the correlations among the excited states.
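The recursive canonical-ensemble construction mentioned in the abstract can be illustrated for the ideal-gas case it starts from. A minimal Python sketch, assuming a 1D harmonic trap with unit level spacing and illustrative particle numbers and temperatures (none of these choices come from the thesis); the recursion Z_n = (1/n) Σ_k C_k Z_{n-k} is the standard ideal-Bose result that the thesis then modifies for the interacting case:

```python
import math

def canonical_partition_functions(N, energies, beta):
    """Z_n for n = 0..N ideal bosons via the standard recursion
    Z_n = (1/n) * sum_{k=1}^{n} C_k * Z_{n-k},  C_k = sum_j exp(-k*beta*eps_j)."""
    C = [None] + [sum(math.exp(-k * beta * e) for e in energies)
                  for k in range(1, N + 1)]
    Z = [1.0]  # Z_0 = 1 by convention
    for n in range(1, N + 1):
        Z.append(sum(C[k] * Z[n - k] for k in range(1, n + 1)) / n)
    return Z

def condensate_occupation(N, energies, beta):
    """Mean ground-state occupation <N_0> = sum_k exp(-k*beta*e0) * Z_{N-k} / Z_N,
    using P(n_0 >= k) = exp(-k*beta*e0) * Z_{N-k} / Z_N."""
    Z = canonical_partition_functions(N, energies, beta)
    e0 = min(energies)
    return sum(math.exp(-k * beta * e0) * Z[N - k] for k in range(1, N + 1)) / Z[N]

# 10 bosons on 1D harmonic-trap levels eps_j = j (j = 0..50), moderate temperature:
levels = [float(j) for j in range(51)]
occ = condensate_occupation(10, levels, beta=2.0)
```

At low temperature (large beta) the occupation approaches N, while thermal depletion appears as beta decreases, mirroring the depletion statistics the abstract describes.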
22

Výpočet standartních termodynamických funkcí jednoduchých sloučenin v podmínkách termálního plazmatu / Calculation of Standard Thermodynamic Functions of Simple Compounds under Thermal Plasma Conditions

Živný, Oldřich, January 2011
The aim of the present work is to provide standard thermodynamic functions (STFs) of small molecules for calculating the composition and thermodynamic properties of low-temperature plasma, together with a method that applies the obtained STFs under non-ideal plasma conditions. With a view to subsequent modelling of phenomena in thermal plasma, the pressure range is limited to 0.01–100 bar and the temperature range to 298.15 K–50 kK. The STFs are obtained via the partition-function method of statistical mechanics. The state of the art in the field and the theoretical foundations of statistical mechanics needed to establish the proposed method are reviewed, together with a discussion of the partition-function divergence problem. For diatomic molecules the STFs are calculated by direct summation over energy levels, whereas for larger molecules the rigid-rotor and harmonic-oscillator models are generally adopted. The spectral data required for the calculations are taken from the literature or, in selected cases, computed by ab initio quantum-chemistry techniques. The resulting STFs have been added to an existing database of thermodynamic properties and can serve as input for subsequent thermodynamic calculations. A general method has been worked out for computing the thermodynamic properties and composition of a non-ideal homogeneous plasma system in thermodynamic equilibrium, based on minimizing the total Gibbs energy for computations at constant pressure, or the Helmholtz energy at constant volume. The algorithm was implemented in a computer program and applied to the composition and thermodynamic properties of the dissociation and ionization products of SF6 using the obtained STFs.
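The direct-summation route for a diatomic can be sketched compactly. A minimal Python illustration, assuming CO-like spectroscopic constants (omega_e ≈ 2169.8 cm⁻¹, B_e ≈ 1.93 cm⁻¹) in the harmonic-oscillator/rigid-rotor level formula; the constants, truncation limits, and finite-difference step are illustrative choices, not the thesis's actual data pipeline:

```python
import math

K_B = 1.380649e-23       # J/K
H = 6.62607015e-34       # J*s
C_LIGHT = 2.99792458e10  # cm/s (so cm^-1 * H * C_LIGHT gives J)

def internal_q(T, omega_e=2169.8, B_e=1.93, v_max=40, j_max=200):
    """Rovibrational partition function of a CO-like diatomic by direct
    summation over levels E(v, J) = omega_e*(v + 1/2) + B_e*J*(J+1) in cm^-1."""
    q = 0.0
    for v in range(v_max + 1):
        e_vib = omega_e * (v + 0.5)
        for j in range(j_max + 1):
            e = (e_vib + B_e * j * (j + 1)) * H * C_LIGHT  # cm^-1 -> J
            q += (2 * j + 1) * math.exp(-e / (K_B * T))
    return q

def heat_capacity_internal(T, dT=1.0):
    """Internal heat capacity in units of k_B:
    C = d/dT [ k T^2 d(ln q)/dT ], via central finite differences."""
    def u(t):  # mean internal energy per molecule, in J
        dlnq = (math.log(internal_q(t + dT)) - math.log(internal_q(t - dT))) / (2 * dT)
        return K_B * t * t * dlnq
    return (u(T + dT) - u(T - dT)) / (2 * dT) / K_B
```

At 300 K rotation is classical and vibration is frozen (theta_vib is a few thousand kelvin), so the internal heat capacity comes out close to 1 k_B; at plasma-relevant temperatures the vibrational contribution switches on, which is exactly the regime where direct summation matters.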
23

Zeros de Fisher e aspectos críticos do modelo de Ising dipolar / Fisher's zeros and critical aspects of the dipolar Ising model

Fonseca, Jacyana Saraiva Marthes, 06 June 2011
We study the critical behavior of the dipolar Ising model on two-dimensional regular lattices. The model presents a phenomenologically rich scenario owing to the frustration caused by the competition between the pure Ising exchange interaction and the dipolar one. To study the criticality of the model we apply finite-size scaling (FSS) relations to the zeros of the partition function in the complex temperature plane; this partition-function-zeros analysis had not previously been applied to this model with long-range interactions. Our study relies on Monte Carlo simulations using the multicanonical algorithm. Our goal is to obtain the critical temperature as a function of the coupling (the ratio between the ferromagnetic and dipolar coupling strengths) and to construct part of the phase diagram. Several parts of the phase diagram still lack conclusive results about the order of the transition lines. In particular, there is evidence in the literature of a tricritical point for couplings in the interval [0.90, 1.00], but its precise location is unknown. Our simulations indicate that the tricritical point does not lie in that interval: for couplings in [0.89, 1.10], the striped phase with stripe width h = 1 passes to the tetragonal phase through a second-order transition. The FSS analysis of the partition function zeros in the temperature variable shows a second-order phase transition at coupling 1.20 and a first-order one at 1.30; thus the tricritical point must occur between 1.20 and 1.30. A complementary study based on the microcanonical approach reveals two second-order phase transitions at coupling 1.20 and two first-order transitions at 1.30, indicating the presence of an intermediate nematic phase.
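The zeros analysis can be illustrated on the smallest system where the partition function is known in closed form. A sketch (not the thesis's multicanonical pipeline): for the 2×2 periodic nearest-neighbor Ising model the density of states is exact, so Z becomes a polynomial in y = exp(-8*beta*J) whose complex roots are the Fisher zeros:

```python
import numpy as np

# Density of states of the 2x2 periodic Ising model: E/J in {-8, 0, +8}
# with degeneracies 2, 12, 2.  In the variable y = exp(-8*beta*J) the
# partition function is proportional to the polynomial 2*y**2 + 12*y + 2.
coeffs = [2.0, 12.0, 2.0]   # highest power first, as numpy.roots expects
zeros = np.roots(coeffs)    # Fisher zeros in the complex y plane

# In an FSS study one builds this polynomial from g(E) for each lattice
# size L and tracks how the zero closest to the positive real axis
# approaches it as L grows.
closest = zeros[np.argmin(np.abs(zeros.imag))]
```

For this tiny lattice both zeros sit on the negative real y axis (no transition at finite size); the scaling of the leading zero's distance to the real axis with L is what yields the critical temperature and exponents in the approach the abstract describes.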
25

IP Algorithm Applied to Proteomics Data

Green, Christopher Lee, 30 November 2004
Mass spectrometry has been used extensively in recent years as a valuable tool in the study of proteomics, but the data it produces is extremely high-dimensional. Reducing the dimensionality often requires imposing many assumptions that can be harmful to subsequent analysis. The IP algorithm is a dimension-reduction algorithm, similar in purpose to latent-variable analysis; it is based on the principle of maximum entropy and therefore imposes a minimal number of assumptions on the data. Partial Least Squares (PLS) is an algorithm commonly used to reduce the dimension of proteomics data from mass spectrometry. Both the IP algorithm and a PLS algorithm were applied to mass-spectrometry proteomics data from three groups of patients: those with no tumors, those with benign tumors, and those with malignant tumors. Logistic regression models were constructed using predictor variables extracted from the reduced data sets, with a three-level response indicating each patient's tumor classification. Misclassification rates were determined for both algorithms; the correct-classification rates for the IP algorithm were equal to or better than those for PLS.
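The PLS side of the comparison is easy to sketch. A minimal NIPALS-style PLS1 in NumPy on synthetic data standing in for a many-feature, few-sample spectrometry matrix (the data, component count, and single-response simplification are illustrative; the thesis's IP algorithm itself is not reproduced here):

```python
import numpy as np

def pls1_components(X, y, n_comp=2):
    """PLS latent scores for a single response y (NIPALS-style deflation):
    each weight vector maximizes covariance between X-scores and y."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    T = []
    for _ in range(n_comp):
        w = X.T @ y
        w /= np.linalg.norm(w)          # unit weight vector
        t = X @ w                       # latent score
        p = X.T @ t / (t @ t)           # loading
        X = X - np.outer(t, p)          # deflate X
        y = y - t * (t @ y) / (t @ t)   # deflate y
        T.append(t)
    return np.column_stack(T)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))           # many "m/z" features, few samples
y = X[:, 0] + 0.1 * rng.normal(size=40)  # response driven by one feature
scores = pls1_components(X, y)           # 40 x 2 reduced representation
```

The reduced `scores` matrix would then feed a (multinomial) logistic regression, as in the abstract; with p >> n the first PLS score already tracks the response closely in-sample.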
26

Mixed-integer programming representation for symmetrical partition function form games

Pepin, Justine
In contexts involving multiple agents (players), determining how they can cooperate by forming coalitions, and how they can share the surplus benefits of collaboration, is crucial: it can provide decision aid to players and analysis tools for policy makers regulating economic markets. Such settings belong to the field of cooperative game theory. A critical element in this area has been the size of the representation of these games: for each possible partition of the players, the value of each coalition in it must be provided. Symmetric partition function form games (SPFGs) are a class of cooperative games with two important characteristics. First, they account for the externalities that any group of players joining forces or splitting into subsets imposes on the remaining coalitions. Second, they treat players as indistinguishable, meaning that only the number of players in each coalition is relevant to the SPFG. Using mixed-integer programming, we present the first representation of SPFGs that is polynomial in the number of players. We also characterize the family of SPFGs that we can represent; in particular, the representation encodes exactly all SPFGs with five players or fewer. Furthermore, we provide a compact representation approximating SPFGs with six or more players that cannot be represented exactly. We also introduce a flexible framework that uses stability methods inspired by the literature to identify, with our representation, a stable social-welfare-maximizing game outcome. We showcase the value of our compact (approximate) representation and of our approach by determining a stable partition and payoff allocation for a competitive market from the literature.
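The representation-size issue motivating the abstract is easy to see in code: a tabular SPFG must store one value per (partition, coalition-size) pair, and the number of integer partitions of n grows faster than any polynomial in n. A stdlib sketch with an invented toy valuation (the valuation formula is purely illustrative, not the thesis's model):

```python
def partitions(n, max_part=None):
    """All partitions of the integer n as non-increasing tuples,
    e.g. partitions(4) yields (4,), (3,1), (2,2), (2,1,1), (1,1,1,1)."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

# A symmetric PFG over 5 players: because players are indistinguishable,
# the value of a coalition of size s embedded in partition p depends only
# on s and the multiset of coalition sizes in p.
n = 5
spfg = {}
for p in partitions(n):
    for s in set(p):
        spfg[(p, s)] = s ** 2 / len(p)  # toy externality-aware valuation
```

Even with symmetry, this table has one entry per embedded coalition size in every partition; the MIP representation of the abstract replaces it with a number of constraints and variables polynomial in n.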
27

Quantum Emulation with Probabilistic Computers

Shuvro Chowdhury (14030571), 31 October 2022
The recent groundbreaking demonstrations of quantum supremacy in the noisy intermediate-scale quantum (NISQ) era have triggered intense activity in establishing finer boundaries between classical and quantum computing. In this dissertation, we use established techniques based on quantum Monte Carlo (QMC) to map quantum problems onto probabilistic networks whose fundamental unit of computation, the p-bit, is inherently probabilistic and can be tuned to fluctuate between '0' and '1' with a desired probability. The mapped network can be viewed as a Boltzmann machine whose states each represent a Feynman path leading from an initial configuration of q-bits to a final configuration. Each such path in general carries a complex amplitude ψ, which can be associated with a complex energy. The real part of this energy is used to generate samples of Feynman paths in the usual way, while the imaginary part is accounted for by treating the samples as complex quantities, unlike in ordinary Boltzmann machines where samples are positive. This mapping of a quantum circuit onto a Boltzmann machine with complex energies should be particularly useful in view of special-purpose hardware accelerators known as Ising machines, which can draw a very large number of samples per second through massively parallel operation. We demonstrate this acceleration on a recently studied quantum problem, speeding up its QMC simulation by a factor of roughly 1000× relative to a highly optimized CPU program. Although this speed-up was demonstrated with a graph-colored architecture on an FPGA, we project another roughly 100× improvement with an architecture that utilizes clockless analog circuits. We believe this will contribute significantly to the growing effort to push the boundaries of the simulability of quantum circuits with classical/probabilistic resources and to compare them with NISQ-era quantum computers.
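The p-bit update at the heart of this mapping can be sketched with a real-energy network (the complex-energy bookkeeping described above is omitted; the couplings, temperature, and sweep count are illustrative). Each p-bit fires with probability sigmoid of its local input, which is exactly sequential Gibbs sampling of an Ising energy:

```python
import math
import random

def pbit_sample(J, h, steps=20000, beta=1.0, seed=1):
    """Sequential p-bit (Gibbs) updates on E(s) = -sum_{i<j} J_ij s_i s_j
    - sum_i h_i s_i with s_i in {-1,+1}; each p-bit fires with probability
    sigmoid(2*beta*I_i), where I_i is its local input. Returns <s_0 s_1>."""
    rng = random.Random(seed)
    n = len(h)
    s = [rng.choice([-1, 1]) for _ in range(n)]
    corr = 0.0
    for _ in range(steps):
        for i in range(n):
            I = h[i] + sum(J[i][j] * s[j] for j in range(n) if j != i)
            p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * I))
            s[i] = 1 if rng.random() < p_up else -1
        corr += s[0] * s[1]
    return corr / steps

# Two ferromagnetically coupled p-bits: <s0 s1> should approach tanh(beta*J).
J = [[0.0, 1.0], [1.0, 0.0]]
c = pbit_sample(J, [0.0, 0.0])
```

For this two-spin network the exact answer is tanh(1) ≈ 0.76, so the sampled correlation provides a quick correctness check; hardware Ising machines run the same update rule massively in parallel.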
28

Improving sampling, optimization and feature extraction in Boltzmann machines

Desjardins, Guillaume
/ Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. 
Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables.
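The tempering machinery underlying the sampling contribution can be sketched with a fixed temperature ladder (the thesis's method adapts the temperatures during learning; here the ladder, target distribution, and sweep counts are illustrative). Replica swaps between adjacent chains use the standard Metropolis exchange rule:

```python
import math
import random

def swap_prob(beta_i, beta_j, E_i, E_j):
    """Metropolis acceptance for exchanging configurations between two
    tempered chains: min(1, exp((beta_i - beta_j) * (E_i - E_j)))."""
    arg = (beta_i - beta_j) * (E_i - E_j)
    return 1.0 if arg >= 0 else math.exp(arg)

def parallel_tempering(energy, betas, n_sweeps=5000, step=1.0, seed=0):
    """Toy parallel tempering on a 1D target; returns cold-chain samples."""
    rng = random.Random(seed)
    x = [0.0] * len(betas)
    cold = []
    for _ in range(n_sweeps):
        for k, b in enumerate(betas):          # local Metropolis moves
            prop = x[k] + rng.uniform(-step, step)
            dE = energy(prop) - energy(x[k])
            if dE <= 0 or rng.random() < math.exp(-b * dE):
                x[k] = prop
        for k in range(len(betas) - 1):        # neighbor swaps
            if rng.random() < swap_prob(betas[k], betas[k + 1],
                                        energy(x[k]), energy(x[k + 1])):
                x[k], x[k + 1] = x[k + 1], x[k]
        cold.append(x[0])
    return cold

double_well = lambda x: (x * x - 4.0) ** 2     # minima at x = +/-2, barrier 16
samples = parallel_tempering(double_well, betas=[2.0, 1.0, 0.5, 0.2])
```

The cold chain alone would stay trapped in one well at beta = 2, but swaps with the hot chains let it visit both modes; the adaptive scheme in the thesis tunes the beta ladder automatically to keep such swaps effective throughout training.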
29

Improving sampling, optimization and feature extraction in Boltzmann machines

Desjardins, Guillaume 12 1900 (has links)
L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de fa ̧con automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présent ́es dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut-être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. 
Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en terme du nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles, un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires. 
/ Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. 
Up until now, its application had been constrained by the computational and memory requirements of the Fisher Information Matrix (FIM), whose size grows quadratically with the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall-clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high-dimensional subspace, resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables.
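The core trick described above — solving for the natural gradient direction without ever materializing the Fisher Information Matrix — can be illustrated with a small sketch. This is a toy illustration of the matrix-free idea, not the thesis's MFNG implementation: it uses an empirical Fisher built from stacked per-sample gradients of a hypothetical model, applies it only through matrix-vector products, and solves the resulting linear system with conjugate gradient.

```python
import numpy as np

def fisher_vector_product(per_sample_grads, v, damping=1e-3):
    # Empirical Fisher: F = (1/N) G^T G, where G stacks N per-sample gradients.
    # F @ v is computed without forming F: G^T (G v) / N, plus damping for stability.
    Gv = per_sample_grads @ v
    return per_sample_grads.T @ Gv / len(per_sample_grads) + damping * v

def conjugate_gradient(mvp, b, iters=50, tol=1e-12):
    # Solve F x = b using only matrix-vector products with F (no explicit matrix).
    x = np.zeros_like(b)
    r = b - mvp(x)
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = mvp(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy usage: 200 samples of 10-dimensional per-sample gradients (random stand-ins).
rng = np.random.default_rng(0)
G = rng.normal(size=(200, 10))
grad = G.mean(axis=0)  # plain gradient
nat_grad = conjugate_gradient(lambda v: fisher_vector_product(G, v), grad)
```

The memory saving is the point: for a model with millions of parameters, the FIM would be far too large to store, while each Fisher-vector product above costs only two matrix-vector multiplications.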

Contributions au développement d'outils computationnels de design de protéine : méthodes et algorithmes de comptage avec garantie / Contribution to protein design tools: counting methods and algorithms

Viricel, Clement 18 December 2017 (has links)
This thesis addresses two intrinsically linked subjects: computing the normalizing constant of a Markov random field and estimating the binding affinity of a protein complex. First, to tackle this #P-complete counting problem, we developed Z*, based on pruning negligible potential quantities. It proved more efficient than state-of-the-art methods on instances derived from protein-protein interactions. We then developed #HBFS, an algorithm with an anytime guarantee, which proved more efficient than its predecessor. Finally, we developed BTDZ, an exact algorithm based on tree decomposition that has proven itself on instances of intermolecular interactions called coiled coils. These algorithms rely on methods from graphical models: local consistencies, variable elimination, and tree decompositions. Using existing optimization methods, Z*, and Rosetta energy functions, we developed an open-source package that estimates the affinity constant of a protein-protein complex over a library of mutants. We analyzed our estimates on a dataset of protein complexes and compared them against two state-of-the-art approaches, finding that our tool was qualitatively better than these methods. / This thesis is focused on two intrinsically related subjects: the computation of the normalizing constant of a Markov random field and the estimation of the binding affinity of protein-protein interactions. First, to tackle this #P-complete counting problem, we developed Z*, based on the pruning of negligible potential quantities. It has been shown to be more efficient than various state-of-the-art methods on instances derived from protein-protein interaction models.
Then, we developed #HBFS, an anytime guaranteed counting algorithm which proved to be even better than its predecessor. Finally, we developed BTDZ, an exact algorithm based on tree decomposition. BTDZ has already proven its efficiency on instances from coiled-coil protein interactions. These algorithms all rely on methods stemming from graphical models: local consistencies, variable elimination and tree decomposition. With the help of existing optimization algorithms, Z* and Rosetta energy functions, we developed a package that estimates the binding affinity of a set of mutants in a protein-protein interaction. We statistically analyzed our estimations on a database of binding affinities and compared them against state-of-the-art methods. It appears that our software is qualitatively better than these methods.
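The pruning idea behind Z* — discarding potential products that contribute negligibly to the normalizing constant — can be illustrated on a tiny pairwise Markov random field. This is a toy sketch under stated assumptions, not the thesis's Z* algorithm: it uses brute-force enumeration over binary variables instead of the guaranteed counting machinery, and a simple absolute threshold `epsilon` (a hypothetical parameter) to drop negligible terms, yielding a lower bound on Z.

```python
import itertools

def z_pruned(unary, pairwise, epsilon=0.0):
    """Lower-bound the partition function Z of a small pairwise MRF over
    binary variables by enumerating assignments and pruning those whose
    unnormalized potential falls below epsilon (toy illustration only)."""
    n = len(unary)
    z = 0.0
    for assignment in itertools.product(range(2), repeat=n):
        w = 1.0
        for i, xi in enumerate(assignment):
            w *= unary[i][xi]                      # unary potentials
        for (i, j), table in pairwise.items():
            w *= table[assignment[i]][assignment[j]]  # pairwise potentials
        if w >= epsilon:                           # prune negligible products
            z += w
    return z

# Toy 3-variable chain MRF with binary variables.
unary = [[1.0, 2.0], [0.5, 1.5], [1.0, 1.0]]
pairwise = {(0, 1): [[2.0, 0.1], [0.1, 2.0]],
            (1, 2): [[1.0, 0.5], [0.5, 1.0]]}
z_exact = z_pruned(unary, pairwise, epsilon=0.0)    # no pruning: exact Z
z_lower = z_pruned(unary, pairwise, epsilon=0.5)    # pruned: lower bound on Z
```

With `epsilon = 0.5`, the four smallest of the eight potential products are discarded, and the pruned sum remains close to the exact Z because the discarded mass is small by construction; the actual Z* algorithm obtains such guarantees without exhaustive enumeration.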
