11

Learning from noisy labels by importance reweighting: a deep learning approach

Fang, Tongtong January 2019
Noisy labels can severely degrade classification performance. For deep neural networks in particular, noisy labels can be memorized and lead to poor generalization. Recently, label-noise-robust deep learning has outperformed traditional shallow learning approaches in handling complex input data without prior knowledge of how the label noise is generated. Learning from noisy labels by importance reweighting is well studied, but existing deep learning work in this line has failed to provide a reasonable reweighting criterion and has therefore performed poorly in experiments. Targeting this knowledge gap, and inspired by domain adaptation, we propose a novel label-noise-robust deep learning approach based on importance reweighting: noisily labeled training examples are weighted by minimizing the maximum mean discrepancy (MMD) between the loss distributions of noisily labeled and cleanly labeled data. In experiments, the proposed approach outperforms the baselines. The results show a vast research potential in applying domain adaptation to the label noise problem by bridging the two areas; moreover, the proposed approach may motivate other interesting problems in domain adaptation by enabling importance reweighting in deep learning.
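As a rough illustration of the kind of criterion this abstract describes, the sketch below learns weights for noisy-labeled examples by minimizing an RBF-kernel MMD between their weighted loss distribution and the loss distribution of clean-labeled data. The kernel choice, bandwidth, and all names are illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch: weight noisy-sample losses so their distribution matches
# clean-sample losses, by minimizing an RBF-kernel squared MMD.
import torch

def rbf_kernel(x, y, sigma=1.0):
    # x: (n, 1), y: (m, 1) loss values; returns the (n, m) Gram matrix
    d2 = (x - y.T) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def weighted_mmd2(noisy_losses, clean_losses, w, sigma=1.0):
    # Squared MMD between the w-weighted noisy-loss distribution and the
    # (uniformly weighted) empirical clean-loss distribution.
    x, y = noisy_losses.view(-1, 1), clean_losses.view(-1, 1)
    w = w / w.sum()
    k_xx = rbf_kernel(x, x, sigma)
    k_xy = rbf_kernel(x, y, sigma)
    k_yy = rbf_kernel(y, y, sigma)
    return w @ k_xx @ w - 2 * (w @ k_xy).mean() + k_yy.mean()

# Toy usage: learn nonnegative, normalized weights for 100 noisy examples.
noisy = torch.randn(100).abs()   # stand-in per-example losses on noisy data
clean = torch.randn(30).abs()    # stand-in per-example losses on clean data
logits = torch.zeros(100, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = weighted_mmd2(noisy, clean, torch.nn.functional.softmax(logits, dim=0))
    loss.backward()
    opt.step()
```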
12

Non-réponse totale dans les enquêtes de surveillance épidémiologique / Unit Nonresponse in Epidemiologic Surveillance Surveys

Santin, Gaëlle 09 February 2015
Nonresponse occurs in most epidemiologic surveys and generates selection bias (in this case, nonresponse bias) when it is related to the variables of interest. Epidemiologic surveillance, one of whose purposes is to estimate prevalences, commonly relies on sample surveys; unit nonresponse then arises and can be corrected with methods from survey statistics. Nonresponse bias can be expressed as the product of the inverse of the response rate and the covariance between the response probability and the variable of interest, so two kinds of solution can reduce it. The first is to increase the response rate through appropriate strategies at the design stage; maximizing the response rate, however, can introduce other kinds of bias, such as measurement bias. In the second, after data collection, information associated a priori with both the variables of interest and the response probability, and available for respondents and nonrespondents alike, is used to compute corrective factors. This requires information on the complete random sample (whether people responded or not), and such information is generally scarce; recent possibilities of access to administrative health databases (notably those of the health insurance system) open new perspectives on this point.

The objectives of this work, centred on nonresponse bias, were to study the contribution of supplementary data (a complementary survey among nonrespondents and administrative databases) and to discuss the influence of the response rate on nonresponse error and measurement error. The analysis focused on the epidemiologic surveillance of occupational risks, using data at inclusion from the pilot phase of the Coset-MSA cohort. In this study, in addition to the data collected by questionnaire (an initial survey and a complementary survey among nonrespondents), auxiliary information from administrative health and occupational databases (SNIIR-AM and MSA) was available for both respondents and nonrespondents.

The results show that data from the initial survey (response rate 24%), corrected for nonresponse with auxiliary information directly related to the survey's themes (health and work), yield prevalence estimates generally close to those obtained by combining the initial and complementary surveys (response rate 63%) after nonresponse adjustment on the same auxiliary information. Pursuing a maximal response rate through a complementary survey therefore does not appear necessary to reduce nonresponse bias. The study nevertheless highlighted potential measurement biases that may be larger in the initial survey than in the complementary survey. A specific study of the trade-off between nonresponse error and measurement error shows that, for the variables that could be studied and after nonresponse correction, the sum of the nonresponse error and the measurement error is equivalent in the initial survey and in the combined surveys (initial plus complementary). This work demonstrated the value of administrative databases for reducing nonresponse error and for studying measurement errors in an epidemiologic surveillance survey.
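The correction described above is commonly implemented as inverse response-propensity weighting: a response model is fitted on auxiliary variables known for the whole sample, and respondents are reweighted by the inverse of their estimated response probability. A minimal sketch on synthetic data follows; variable names and the data-generating process are illustrative, not those of the Coset-MSA study.

```python
# Hedged sketch of inverse response-propensity weighting for unit nonresponse.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 2))                       # auxiliary data, known for everyone
p_resp = 1 / (1 + np.exp(-(0.3 + 0.8 * x[:, 0]))) # true (unknown) response propensity
responded = rng.random(n) < p_resp                # response indicator
y = (x[:, 0] + rng.normal(size=n)) > 1            # outcome, observed only for respondents

model = LogisticRegression().fit(x, responded)    # propensity model on the full sample
w = 1.0 / model.predict_proba(x[responded])[:, 1] # inverse-propensity weights

naive = y[responded].mean()                       # biased: respondents differ from sample
adjusted = np.average(y[responded], weights=w)    # nonresponse-corrected prevalence
print(f"true {y.mean():.3f}  naive {naive:.3f}  adjusted {adjusted:.3f}")
```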
13

On the empirical measurement of inequality / De la mesure empirique des inégalités

Flores, Ignacio 25 January 2019
The first chapter presents historical series of Chilean top income shares over half a century, based mostly on tax statistics and national accounts. The study contradicts survey-based evidence, according to which inequality has fallen steadily over the past 25 years; rather, it changes direction, rising from around the year 2000. Chile ranks as one of the most unequal countries in both the OECD and Latin America over the whole period of study. The second chapter measures the underestimation of factor income in distributive data. Households receive only half of national gross capital income, as opposed to corporations; owing to heterogeneous non-response and misreporting, surveys capture only about 20% of capital income, versus 70% of labor income. This understates inequality estimates, which become insensitive to the capital share and its distribution. I formalize this system using accounting identities and then compute marginal effects and contributions to changes in fractile shares. The third chapter presents a method to adjust surveys, which generally fail to capture the top of the income distribution. It has several advantages over previous options: it is consistent with standard survey calibration methods; it has explicit probabilistic foundations and preserves the continuity of density functions; it provides an option to overcome the limitations of bounded survey supports; and it preserves the microdata structure of the survey, keeping sociodemographic variables representative. The procedure is illustrated with applications in five countries, covering both developed and less developed contexts.
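One ingredient of such a top-tail adjustment can be sketched as follows: above a merging threshold, survey weights are redistributed so the tail follows a Pareto distribution whose coefficient would come from external (e.g. tax) data. This is a deliberately simplified stand-in for the chapter's method, which additionally preserves calibration and density continuity; the function and parameter names are mine.

```python
# Simplified sketch: rescale survey weights above a threshold to a Pareto tail.
import numpy as np

def pareto_adjust(income, weight, threshold, alpha):
    """Redistribute the weight mass above `threshold` following Pareto(alpha)."""
    income = np.asarray(income, float)
    weight = np.asarray(weight, float).copy()
    top = income > threshold
    top_mass = weight[top].sum()                      # total mass in the tail is kept
    # Pareto density evaluated at the observed tail incomes
    dens = alpha * threshold**alpha / income[top] ** (alpha + 1)
    weight[top] = top_mass * dens / dens.sum()        # redistribute within the tail
    return weight

rng = np.random.default_rng(1)
inc = rng.lognormal(10, 1, 10000)                     # synthetic survey incomes
w = np.ones(10000)                                    # uniform design weights
w_adj = pareto_adjust(inc, w, threshold=np.quantile(inc, 0.99), alpha=1.8)
```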
14

Determining Protein Conformational Ensembles by Combining Machine Learning and SAXS / Bestämning av konformationsensembler hos protein genom att kombinera maskininlärning med SAXS

Eriksson Lidbrink, Samuel January 2023
In structural biology, immense effort has been put into discovering functionally relevant atomic-resolution protein structures. Still, most experimental, computational and machine-learning-based methods alone struggle to capture all the functionally relevant states of many proteins without very involved, system-specific techniques. In this thesis, I propose a new, broadly applicable method for determining an ensemble of functionally relevant protein structures. The method consists of (1) generating multiple protein structures from AlphaFold2 (AF2) by stochastic subsampling of the multiple sequence alignment (MSA) depth, (2) screening these structures using small-angle X-ray scattering (SAXS) data and a structure-validation scoring tool, (3) simulating the screened conformers in short molecular dynamics (MD) simulations, and (4) refining the ensemble of simulated structures by reweighting it against SAXS data using a Bayesian maximum entropy (BME) approach. I apply the method to the T-cell intracellular antigen-1 (TIA-1) protein and find that the generated ensemble is in good agreement with the SAXS data it is fitted to, in contrast to the original set of conformations from AF2. Additionally, the predicted radius of gyration is much more consistent with the experimental value than that predicted from a 450 ns MD simulation started from a single structure. Finally, I cross-validate my findings against small-angle neutron scattering (SANS) data and find that the method-generated ensemble, although not perfectly, fits some of the SANS data much better than the ensemble from the long MD simulation. Since the method is fairly automatic, I argue that it could be used by non-experts in MD simulations, and also combined with more advanced methods for more accurate results. I also propose generalisations of the method: tuning it to different biological systems, using other AI-based methods, or using different types of experimental data.
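Step (4), the BME reweighting, can be sketched as follows: frame weights are chosen to trade off the chi-squared fit of the ensemble-averaged SAXS curve against the relative entropy to uniform weights, with a parameter theta balancing the two terms. The data below are synthetic stand-ins; the thesis's actual SAXS pipeline and software are not reproduced here.

```python
# Minimal Bayesian/maximum-entropy (BME) reweighting sketch on synthetic data.
import numpy as np
from scipy.optimize import minimize

def bme_weights(calc, exp, err, theta=10.0):
    """calc: (n_frames, n_q) computed SAXS curves; exp, err: (n_q,) experiment."""
    n = calc.shape[0]

    def objective(lw):
        w = np.exp(lw - lw.max())
        w /= w.sum()
        chi2 = np.sum(((w @ calc - exp) / err) ** 2)     # fit to experiment
        s_rel = np.sum(w * np.log(n * w + 1e-12))        # relative entropy to uniform
        return 0.5 * chi2 + theta * s_rel

    res = minimize(objective, np.zeros(n), method="L-BFGS-B")
    w = np.exp(res.x - res.x.max())
    return w / w.sum()

rng = np.random.default_rng(2)
calc = rng.normal(1.0, 0.1, size=(50, 40))   # 50 frames, 40 q-points
exp = calc[:10].mean(axis=0)                 # pretend experiment favours 10 frames
w = bme_weights(calc, exp, err=np.full(40, 0.05))
```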
15

Gaussian Critical Line in Anisotropic Mixed Quantum Spin Chains / Gaußsche kritische Linie in anisotropen, gemischten Quantenspinketten

Bischof, Rainer 18 March 2013
By numerical methods, two models of anisotropic mixed quantum spin chains, consisting of spins of two different sizes, Sa = 1/2 and Sb = 1 as well as Sb = 3/2, are studied with respect to their critical properties at quantum phase transitions in a selected region of parameter space. The chains are built from base cells of four spins, according to the structure Sa − Sa − Sb − Sb, and are described by the XXZ Hamiltonian, which extends the isotropic quantum Heisenberg model by a variable anisotropic exchange interaction. As an additional control parameter, an alternating exchange constant between nearest-neighbour spins is introduced. Insight gained by the complementary application of exact diagonalization and quantum Monte Carlo simulations, together with appropriate methods of analysis, is embedded in the broad existing knowledge on homogeneous quantum spin chains. In anisotropic homogeneous quantum spin chains there exist phase boundaries with continuously varying critical exponents, the Gaussian critical lines, along which extended scaling relations hold in addition to the standard ones. Reweighting methods, also applied to improved quantum Monte Carlo estimators, and finite-size scaling analysis of the simulation data deliver a wealth of numerical results confirming the existence of a Gaussian critical line in the mixed-spin models as well. Extrapolation of exact data confirms the simulation results and, furthermore, offers insight into part of the conformal operator content of the model with Sb = 1.
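For readers unfamiliar with reweighting in this context, the classical single-histogram (Ferrenberg-Swendsen) scheme conveys the idea: samples generated at one coupling are reweighted to estimate observables at a nearby coupling. The sketch below uses synthetic energies as a stand-in; the thesis applies analogous reweighting to (improved) quantum Monte Carlo estimators.

```python
# Single-histogram reweighting sketch: estimate <obs> at inverse temperature
# beta from samples generated at beta0.
import numpy as np

def reweight(energies, obs, beta0, beta):
    """Reweighted expectation of `obs` at beta, given samples drawn at beta0."""
    logw = -(beta - beta0) * energies
    logw -= logw.max()                  # numerical stabilisation
    w = np.exp(logw)
    return np.sum(w * obs) / np.sum(w)

rng = np.random.default_rng(3)
E = rng.normal(-100.0, 5.0, 100_000)            # synthetic energies sampled at beta0
M = -0.01 * E + rng.normal(0, 0.1, E.size)      # a correlated synthetic observable
print(reweight(E, M, beta0=1.0, beta=1.02))     # estimate at a nearby coupling
```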
16

[en] COMBINING STRATEGIES FOR ESTIMATION OF TREATMENT EFFECTS / [pt] COMBINANDO ESTRATÉGIAS PARA ESTIMAÇÃO DE EFEITOS DE TRATAMENTO

RAFAEL DE CARVALHO CAYRES PINTO 19 January 2018
Estimation of the mean treatment effect is an important tool for evaluating economic policy. The main difficulty in this calculation is that the assignment of potential participants to treatment is generally not random, which leads to selection bias when ignored. A solution to this problem is to suppose that the econometrician observes a set of covariates that determine participation, except for a strictly random component. Under this assumption, known as ignorability, semiparametric estimation methods have been developed, including imputation of counterfactual outcomes and sample reweighting. Both are consistent and can asymptotically achieve the semiparametric efficiency bound; in the sample sizes commonly available, however, their performance is not always satisfactory. The goal of this dissertation is to study how combining the two strategies can produce estimators with better small-sample properties. We consider two ways of merging these approaches, drawing on the doubly robust inference literature developed by James Robins and co-authors, analyze their properties, and discuss why they may outperform each of their components used in isolation. Finally, we compare the proposed estimators with imputation and reweighting in a Monte Carlo exercise. The results show that combining strategies can reduce bias and variance, but this depends on how the combination is implemented. We conclude that the choice of smoothing parameters is critical to obtaining good estimates in moderate-sized samples.
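The doubly robust combination discussed above is typically realized as the augmented inverse-propensity-weighted (AIPW) estimator, which is consistent if either the outcome regressions (imputation) or the propensity model (reweighting) is correctly specified. A minimal sketch on synthetic data follows; it does not reproduce the dissertation's Monte Carlo design, and all names are illustrative.

```python
# AIPW (doubly robust) estimate of an average treatment effect on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(4)
n = 4000
x = rng.normal(size=(n, 3))
p = 1 / (1 + np.exp(-x[:, 0]))               # true propensity (nonrandom assignment)
t = rng.random(n) < p                        # treatment indicator
y = 2.0 * t + x @ [1.0, 0.5, -0.5] + rng.normal(size=n)   # true effect = 2

ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]  # reweighting component
mu1 = LinearRegression().fit(x[t], y[t]).predict(x)         # imputation, treated arm
mu0 = LinearRegression().fit(x[~t], y[~t]).predict(x)       # imputation, control arm

# Imputation term plus inverse-propensity-weighted residual corrections
aipw = np.mean(mu1 - mu0
               + t * (y - mu1) / ps
               - (~t) * (y - mu0) / (1 - ps))
print(f"AIPW estimate of the average treatment effect: {aipw:.3f}")
```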
18

Improvement of Monte Carlo algorithms and intermolecular potentials for the modelling of alkanols, ethers, thiophenes and aromatics

Pérez Pellitero, Javier 05 October 2007
Parallel with the increase in computer speed over the last decade, molecular simulation techniques have emerged as important tools for predicting the physical properties of systems of industrial interest. These properties are essential in the chemical and petrochemical industries for process design, optimization, simulation and control, and the current moderate cost of powerful computers makes molecular simulation an excellent way to provide such predictions. Its predictive capability is particularly valuable under extreme conditions of temperature and pressure, or when toxic compounds are involved, because experimentation under such conditions is difficult and expensive. Molecular simulation offers an alternative to the thermophysical models traditionally used in industry, such as equations of state, activity-coefficient models and corresponding-states theories, which provide good approximations at minimal computational cost but are often inadequate when only limited information is available to determine the necessary parameters, or when reproducing complex fluid properties such as those of hydrogen-bonding fluids or polymers. Monte Carlo (MC) methods constitute, together with molecular dynamics, one of the molecular simulation techniques best suited to computing thermophysical properties: unlike molecular dynamics they provide no information about molecular trajectories, but they focus on equilibrium properties and are in general more efficient for phase-equilibrium calculations and for systems with long relaxation times due to low diffusion coefficients and high viscosities.

This thesis aims to develop and improve both simulation algorithms and intermolecular potentials, two factors considered key to the development of Monte Carlo techniques. Locating critical points precisely has long been a problem for the methods habitually used in phase-equilibrium calculations, such as the Gibbs ensemble method: strong density fluctuations in the critical region make simulation data unobtainable there, because the correlation length exceeds the finite length of the simulation box. To provide an adequate route to the critical points of pure components and binary mixtures, histogram reweighting (HR) techniques are combined with finite-size scaling (FSS) studies; such methods have recently proved far better founded and more precise for simple systems, and here they are extended to real mixtures of industrial interest. As a preliminary step, the Lennard-Jones (LJ) fluid, which reproduces the behaviour of simple fluids such as argon or methane, is taken as a reference; in this case the predictions are affected only by the omnipresent statistical errors, not by the accuracy of the model chosen to reproduce real molecules. Simulations are performed in the grand canonical ensemble (GCMC) using the GIBBS code; liquid-vapor coexistence curves are obtained from HR techniques for pure fluids and binary mixtures, while critical parameters are obtained from FSS to close the phase envelopes. To extend the calculations to multicomponent systems, modifications of the conventional HR techniques are introduced that avoid the construction of histograms and the consequent need for large memory resources. In addition, an alternative methodology, the fourth-order cumulant or Binder parameter, is implemented to make the location of the critical point more straightforward, considering two possibilities: the intersection of the Binder parameter for two different system sizes, or its intersection with the known value for the system's universality class combined with an FSS study.

The development of transferable potential models able to describe the inter- and intramolecular energies of the simulated molecules constitutes the second focus. One of the most common approaches for modelling hydrocarbons and other flexible molecules is the united-atom model, in which each chemical group is represented by a single Lennard-Jones centre; this significantly reduces computational time compared with all-atom models, since the number of pair interactions grows as the square of the number of sites. The anisotropic united-atom (AUA) model improves on the standard scheme by displacing the Lennard-Jones centres of force towards the hydrogen atoms, so that the displacement distance becomes a third adjustable parameter alongside the two Lennard-Jones parameters. In this thesis, AUA4 intermolecular potentials are developed for three families of industrially relevant compounds: thiophenes, whose thermodynamic properties are increasingly needed because ever stricter environmental restrictions require the elimination of sulfur compounds while only limited experimental data exist; alkanols, notably methanol and ethanol, which the rise of biofuels has made important to the petrochemical industry, and for which the hydroxyl group is parameterized together with a set of electrostatic charges optimized to best reproduce the electrostatic potential created by a reference molecule in vacuum; and ethers, widely used as solvents, for which an AUA4 potential capable of quantitatively reproducing coexistence properties is developed.
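The AUA idea described above lends itself to a compact sketch: each Lennard-Jones force centre is displaced from the heavy atom towards the group's hydrogens by a distance delta, which becomes a third adjustable parameter next to epsilon and sigma. The parameter values below are placeholders, not the fitted AUA4 values.

```python
# Sketch of an anisotropic united-atom (AUA) pair interaction: displaced
# Lennard-Jones force centres with a 6-12 potential between them.
import numpy as np

def aua_site(heavy_atom, ch_direction, delta):
    """Force-centre position: heavy atom shifted by delta along the C->H direction."""
    d = ch_direction / np.linalg.norm(ch_direction)
    return heavy_atom + delta * d

def lj(r, epsilon, sigma):
    """6-12 Lennard-Jones energy between two force centres a distance r apart."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)

# Two groups with placeholder geometry (nm) and placeholder epsilon/sigma/delta
c1 = aua_site(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]), delta=0.02)
c2 = aua_site(np.array([0.4, 0.0, 0.0]), np.array([0.0, 0.0, -1.0]), delta=0.02)
print(lj(np.linalg.norm(c1 - c2), epsilon=0.01, sigma=0.35))
```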
