  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

The Estimation of Selected Physicochemical Properties of Organic Compounds

Al-Antary, Doaa Tawfiq January 2018
Thermodynamic relationships are used to predict several physicochemical properties of organic compounds. Chapter 1 deals with the UPPER model (Unified Physicochemical Property Estimation Relationships), which has been used to predict nine essential physicochemical properties of pure compounds. It was developed almost 25 years ago and has been validated by the Yalkowsky group for almost 2000 aliphatic, aromatic, and polyhalogenated hydrocarbons. UPPER is based on a group of additive and nonadditive descriptors along with a series of well-accepted thermodynamic relationships. In this model, the two-dimensional chemical structure is the only input needed. Chapter 1 extends the applicability of UPPER to hydrogen-bonding and non-hydrogen-bonding aromatic compounds with several functional groups, such as alcohol, aldehyde, ketone, carboxylic acid, carbonate, carbamate, amine, amide, and nitrile, as well as aceto and nitro compounds. The total data set includes almost 3000 compounds. Aside from the enthalpies and entropies of melting and boiling, no training set is used for the calculation of the properties. The results show that UPPER enables a reasonable estimation of all the considered properties. Chapter 2 uses a modification of the van't Hoff equation to predict the solubility of organic compounds in dry octanol. The equation represents a linear relationship between the logarithm of the solubility of a solute in octanol and its melting temperature. More than 620 experimentally measured octanol solubilities, collected from the literature, are used to validate the equation without any regression or fitting. The average absolute error of the prediction is 0.66 log units. Chapter 3 compares a statistics-based model for the prediction of aqueous solubility with the existing general solubility equation (GSE).
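As a rough illustration of the kind of fit-free estimate discussed in this abstract, the sketch below codes the general solubility equation in its commonly cited form, log Sw = 0.5 - 0.01 (MP - 25) - log Kow, together with an average-absolute-error helper. The equation form is the widely quoted one and is an assumption here, not something taken from the thesis itself; the function names and the example inputs are hypothetical.

```python
def gse_log_solubility(melting_point_c: float, log_kow: float) -> float:
    """Commonly cited general solubility equation (assumed form).

    Returns log10 of the molar aqueous solubility:
        log Sw = 0.5 - 0.01 * (MP - 25) - log Kow
    The melting-point term is set to zero for compounds that are
    liquid at 25 degrees Celsius.
    """
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_kow


def average_absolute_error(predicted, observed):
    """Mean of |predicted - observed|, in log units."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical compound: melts at 125 degrees Celsius, log Kow = 2.0.
example_log_sw = gse_log_solubility(125.0, 2.0)
```

No fitted coefficient appears anywhere in the function, which is the sense in which such an equation can be validated "without any regression or fitting".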
22

Robust Machine Learning QSPR Models for Recognizing High Performing MOFs for Pre-Combustion Carbon Capture and Using Molecular Simulation to Study Adsorption of Water and Gases in Novel MOFs

Dureckova, Hana January 2018
Metal organic frameworks (MOFs) are a class of nanoporous materials composed through self-assembly of inorganic and organic structural building units (SBUs). MOFs show great promise for many applications due to their record-breaking internal surface areas and tunable pore chemistry. This thesis work focuses on gas separation applications of MOFs in the context of carbon capture and storage (CCS) technologies. CCS technologies are expected to play a key role in the mitigation of anthropogenic CO2 emissions in the near future. In the first part of the thesis, robust machine learning quantitative structure-property relationship (QSPR) models are developed to predict CO2 working capacity and CO2/H2 selectivity for pre-combustion carbon capture using the most topologically diverse database of hypothetical MOF structures constructed to date (358,400 MOFs, 1166 network topologies). The support vector regression (SVR) models are developed on a training set of 35,840 MOFs (10% of the database) and validated on the remaining 322,560 MOFs. The most accurate models for CO2 working capacities (R2 = 0.944) and CO2/H2 selectivities (R2 = 0.876) are built from a combination of six geometric descriptors and three novel y-range normalized atomic-property-weighted radial distribution function (AP-RDF) descriptors. 309 common MOFs are identified between the grand canonical Monte Carlo (GCMC) calculated and SVR-predicted top-1000 high-performing MOFs ranked according to a normalized adsorbent performance score. This work shows that SVR models can indeed account for the topological diversity exhibited by MOFs. In the second project of this thesis, computational simulations are performed on a MOF, CALF-20, to examine its chemical and physical properties which are linked to its exceptional water-resisting ability. 
We predict the atomic positions in the crystal structure of the bulk phase of CALF-20, for which only a powder X-ray diffraction pattern is available, from a single-crystal X-ray diffraction pattern of a metastable phase of CALF-20. Using the predicted CALF-20 structure, we simulate adsorption isotherms of CO2 and N2 under dry and humid conditions, which are in excellent agreement with experiment. Snapshots of CALF-20 during water sorption simulations reveal that water molecules in a given pore adsorb and desorb together due to hydrogen bonding. Binding sites and binding energies of CO2 and water in CALF-20 show that the preferential CO2 uptake at low relative humidities is driven by the stronger binding energy of CO2 in the MOF, and the sharp increase in water uptake at higher relative humidities is driven by the strong intermolecular interactions between water molecules. In the third project of this thesis, we use computational simulations to investigate the effects of residual solvent on the CH4 and N2 adsorption properties of Ni-BPM. Single-crystal X-ray diffraction data show that there are two sets of positions (Set 1 and Set 2) that can be occupied by the 10 residual DMSO molecules in the Ni-BPM framework. GCMC simulations of CH4 and N2 uptake in Ni-BPM reveal that CH4 uptake is in closest agreement with experiment when the 10 DMSO molecules are distributed between the two sets of positions in equal ratio (Mixed Set). Severe under-prediction and over-prediction of CH4 uptake are observed when the DMSO molecules are placed in the Set 1 and Set 2 positions, respectively. Binding site analysis shows that the CH4 binding sites within the Ni-BPM framework overlap with the Set 1 DMSO positions but not with the Set 2 DMSO positions, which explains the deviations in CH4 uptake observed for these cases. Binding energy calculations reveal that CH4 molecules are most stabilized when the DMSO molecules are in the Mixed Set of positions.
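The train/validation arithmetic and the top-1000 comparison described in the first project of this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the helper `top_k_overlap`, the MOF labels, and the scores are all hypothetical placeholders, and the thesis's actual ranking uses a normalized adsorbent performance score that is not reproduced here.

```python
def top_k_overlap(scores_a, scores_b, k):
    """Count items that appear in the top-k of both score dictionaries."""
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b)


# Split sizes quoted in the abstract: 10% for training, the rest for validation.
database_size = 358_400
train_size = database_size // 10
validation_size = database_size - train_size

# Placeholder scores (not real data) for four hypothetical structures.
simulated = {"mof_a": 0.9, "mof_b": 0.7, "mof_c": 0.4, "mof_d": 0.1}
predicted = {"mof_a": 0.8, "mof_b": 0.3, "mof_c": 0.6, "mof_d": 0.2}
shared_top_2 = top_k_overlap(simulated, predicted, k=2)
```

With k = 1000 and the full database, the same count would correspond to the number of MOFs common to the simulated and predicted top-1000 lists.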
23

Computer-Aided Molecular Design (CAMD) Using Signature Molecular Descriptors To Identify New Corrosion Inhibitors for Steel Reinforced Concrete

Mohamed, Ahmed 02 August 2023
No description available.
24

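Modelado predictivo de sistemas complejos para informática molecular : desarrollo de métodos de selección y aprendizaje de características en presencia de incertidumbre / Predictive modeling of complex systems for molecular informatics: development of feature selection and feature learning methods in the presence of uncertainty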

Cravero, Fiorella 13 March 2020
Nowadays, there is an increasing need to guide the in silico discovery of new industrial polymers through supervised Machine Learning approaches that identify structure-property correlations from the information contained in materials databases, where each material is characterized by Molecular Descriptors (MDs). These correlations are known as Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models. They can be used to predict properties of interest before the chemical synthesis stage, which helps accelerate the design of new materials and reduce their development costs. QSAR/QSPR modeling is already widely used in Molecular Informatics for Computer-Aided Drug Design. However, polymeric materials are significantly more complex than small molecules such as drugs, since they are collections of macromolecules made up of thousands of chains, each formed by linking a large number of structural repetitive units (SRUs). These chains have different molecular weights (or lengths) and, in turn, appear with different frequencies within each material. This phenomenon, known as polydispersity, is the main reason why many approaches developed for rational drug design are neither directly applicable nor sufficiently effective in the field of Polymer Informatics. The main objective of this thesis is to contribute solutions to various issues of computational representation and algorithm development that arise during QSPR modeling of the properties of high-molecular-weight polydisperse polymers, with special emphasis on the feature selection problem for molecular descriptors.
Because the frequencies of the different chain lengths vary, the structural description of a polymeric material contains uncertainty, in contrast with the typical structural characterization of a small molecule. However, because modeling this uncertainty is complex, most QSAR/QSPR studies to date have used simple, univalued molecular models; that is, they calculate the molecular descriptors for a single weight instance among all the possible chains that constitute a material. In particular, almost all of these studies use descriptors calculated on a single SRU, regardless of polydispersity. In this context, the thesis investigates different alternatives of Feature Selection and Feature Learning for QSPR modeling with uncertainty, exploring the effectiveness of more realistic computational representations for polymeric materials. First, a hybrid methodology that uses both Feature Selection and Feature Learning algorithms is presented to evaluate the maximum predictive capability that the traditional univalued SRU representation (called URE, after the Spanish acronym for SRU) can achieve.
Then, new univalued representations based on average molecular weights are proposed, called the Mn and Mw molecular models, whose capabilities to infer QSPR models are contrasted with those of the URE molecular model. The next alternative is a trivalued computational representation, based on the integration of the URE, Mn, and Mw univalued molecular models into a single database, which partially captures the polydispersity inherent to polymers. This characterization improves the generalizability of the QSPR models obtained during the supervised learning process, compared to those inferred through univalued representation approaches. However, the trivalued representation still does not take into account the frequencies with which the different chain lengths appear within a material. Finally, the thesis contributes a multivalued computational representation based on the actual polydisperse profile of a material, in which each descriptor is characterized by a discrete probability distribution. In this context, the Feature Selection techniques used for univalued representations are no longer applicable, and algorithms that can operate on this new multivalued molecular model are needed. To meet this need, the design and implementation of an algorithm for the selection of multivalued features are presented. This new method, Feature Selection for Random Variables with Discrete Distribution (FS4RVDD), achieves promising performance in all the experimental scenarios tested in these investigations.
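The Mn and Mw models mentioned above take their names from the number-average and weight-average molecular weights of a polydisperse material. The sketch below computes both from a discrete chain distribution using the standard textbook definitions; it is only meant to show how a single material yields several representative weights, and it is not the thesis's descriptor pipeline. The function names and the two-chain distribution are hypothetical.

```python
def number_average_mw(distribution):
    """Mn = sum(N_i * M_i) / sum(N_i) for (molar_mass, chain_count) pairs."""
    total_chains = sum(count for _, count in distribution)
    return sum(mass * count for mass, count in distribution) / total_chains


def weight_average_mw(distribution):
    """Mw = sum(N_i * M_i**2) / sum(N_i * M_i)."""
    total_mass = sum(mass * count for mass, count in distribution)
    return sum(mass * mass * count for mass, count in distribution) / total_mass


# Hypothetical material: equal numbers of chains at 10,000 and 30,000 g/mol.
chains = [(10_000.0, 1), (30_000.0, 1)]
mn = number_average_mw(chains)
mw = weight_average_mw(chains)
dispersity = mw / mn  # equals 1.0 only when every chain has the same mass
```

A univalued model keeps one of these numbers, whereas the multivalued representation described in the abstract would keep the whole `(molar_mass, chain_count)` profile.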
25

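Predicción de propiedades de sustancias y materiales de interés en la industria química a través del desarrollo de métodos computacionales / Prediction of properties of substances and materials of interest to the chemical industry through the development of computational methods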

Palomba, Damián 17 March 2014
The goal of this Thesis is to develop predictive computational methods for specific properties of compounds of interest in the chemical industry, particularly in the pharmaceutical and polymeric materials industries. The working method is built on the Quantitative Structure/Property Relationship (QSPR) technique, which quantitatively relates different parameters of a chemical entity (e.g. a small molecule or a polymer) to a well-defined property of that entity. This work is planned as an interdisciplinary study, with the aim of enriching the QSPR technique with knowledge of the measurement test for the target property and, above all, with the physicochemical aspects involved. The method was applied first to predict properties of drugs and general organic compounds and, second, to predict properties of polymeric materials. For drugs and organic compounds, physicochemical properties related to ADMET (absorption, distribution, metabolism, excretion and toxicity) behavior were explored. These were Human Intestinal Absorption (HIA) and Blood-Brain Barrier (BBB) penetration, both essential for drug development. Furthermore, volatile organic compounds (VOCs), which are gases emitted from certain solids or liquids, were studied. Their blood-to-liver partition coefficients (log Pliver) were predicted; these can be used in risk assessment and decision making in public health policies.
In the field of polymeric materials, several properties were studied. One of them is a thermal property, the glass transition temperature (Tg), which is related to the mechanical performance and processability of the material; the remaining ones are mechanical properties derived from the uniaxial tensile test: elongation at break, strength at break, and tensile (Young's) modulus. These mechanical properties provide information related to the ductility, strength, and stiffness of a polymeric material, respectively, and, along with others, define its structural application profile. The Thesis is broadly organized into two main blocks according to the material for which the prediction is made: drugs and volatile organic compounds (compounds of interest in the pharmaceutical industry and public health) on the one hand, and polymeric materials (materials of interest in the chemical industry) on the other. This structure reflects the significant molecular differences between the working compounds from which the property to be predicted (the target property) is obtained, and hence the different approaches with which each prediction was addressed.
The original contribution in the field of drugs and volatile organic compounds was the development of new predictive models for the aforementioned properties, using a semi-automatic approach (an automatic variable-selection method combined with a manual selection guided by expert knowledge) that can also be applied to model other properties and other compounds. Moreover, bringing physicochemical knowledge into the modeling phase led to more acceptable models, since they are easier to interpret and tend to generalize better to design (virtual) compounds, i.e. compounds not yet synthesized.
In the field of polymeric materials, the novel contributions were the different models generated to predict the thermal property and the mechanical properties mentioned above. A synthetic molecular prototype, consisting of a trimeric structure, was developed to represent the polymers. New descriptors were proposed for polymeric materials by means of an original treatment of the polymer chains that distinguishes the fragments belonging to the main chain from those belonging to the side chain. A prediction model for Tg was obtained, enriched by the underlying physicochemical knowledge of the studied phenomenon, and a detailed structural explanation of the model descriptors and their relation to the studied property was presented. Afterwards, the molecular prototype (trimer) was validated against more complex structures (31 repeating units). For the mechanical properties, a working dataset for synthetic polymers, collected and curated from available sources, was presented. Two kinds of descriptors were proposed: new polymer-chain descriptors on the one hand, and experimental parameters on the other.
Finally, the usefulness of incorporating experimental information from the tensile test along with structural strategies to tackle the prediction was demonstrated, thereby providing more intelligent and interpretable tools for the design of new materials with a specific application profile.
26

Modeling and visualization of complex chemical data using local descriptors / La modélisation et la visualisation de données chimiques complexes en utilisant les descripteurs locaux

Glavatskikh, Marta 09 July 2018
This work describes original approaches for predictive chemoinformatics modeling of molecular interactions and reactions as a function of the structures of the interacting partners and of the chemical environment (experimental conditions). It considers systems in which not only the molecular structure but also the experimental conditions are involved. Chemical structures have been encoded by local ISIDA MA-based or CGR-based descriptors, specifically targeting the active centers and their closest environment. The local descriptors have been combined with the specific parameters of the experimental conditions, thereby encoding a particular chemical object. The methodology has been successfully applied to QSPR modeling of thermodynamic and kinetic parameters of intermolecular interactions (halogen and hydrogen bonds), tautomeric equilibria, and chemical reactions (cycloaddition and SN1). The GTM method has been applied for the first time to the modeling and visualization of mixed chemical data. It successfully separates data clusters according to both chemical structures and experimental conditions.
27

Modélisation QSPR de solvants d’intérêt technologique : les liquides ioniques et les électrolytes pour batteries Li-ion / QSPR modelling of technologically interesting solvents : the ionic liquids and the electrolytes for Li-ion batteries

Delouis, Grace 26 September 2017
This thesis is dedicated to the modelling of ionic liquids and electrolytes for Li-ion batteries. We developed several SVR models in order to predict 9 properties of interest for these solvents. The models built for the ionic liquids allowed us to detect several problems, and are freely available on the laboratory's website: infochim.u-strasbg.fr/webserv/VSEngine.html. The models built for the electrolytes were used to model candidates tested experimentally by our collaborators. As the amount of data is limited for these solvents, we also tested the transductive approach by means of TRR (Transductive Ridge Regression). We set up an optimization protocol for the method's parameters and applied TRR to the studied solvents. The results obtained with TRR are slightly better than those of Ridge Regression, but the gains remain modest if accidental deterioration of the model is to be avoided.
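For readers unfamiliar with the baseline mentioned in this abstract, the sketch below shows ordinary ridge regression for a single descriptor with no intercept, where the closed-form weight is w = Σxy / (Σx² + λ). It is a plain, inductive ridge fit under simplifying assumptions and does not implement the transductive variant (TRR) studied in the thesis; the function name and the toy numbers are hypothetical.

```python
def ridge_weight(x, y, penalty):
    """Closed-form ridge solution for one descriptor and no intercept.

    Minimizes sum((y_i - w * x_i)**2) + penalty * w**2, which gives
    w = sum(x_i * y_i) / (sum(x_i**2) + penalty).
    """
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + penalty)


# Toy numbers (not real data): y is exactly 2 * x.
descriptor = [1.0, 2.0, 3.0]
target = [2.0, 4.0, 6.0]
unpenalized = ridge_weight(descriptor, target, penalty=0.0)
shrunk = ridge_weight(descriptor, target, penalty=14.0)
```

Increasing `penalty` shrinks the weight toward zero, which is the usual trade of a little bias for stability when the data set is small.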
28

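Modelos de predição do coeficiente de sorção no solo de pesticidas não iônicos: diferentes algoritmos de logP e uma abordagem alternativa de logS. / Models for predicting the soil sorption coefficient of nonionic pesticides: different logP algorithms and an alternative logS approach.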

Reis, Ralpho Rinaldo dos 17 May 2013
Collecting data on the effects of pesticides on the environment and its ecosystems is a slow and costly process. Significant research effort has therefore gone into developing mathematical models that predict physical, chemical, or biological properties of environmental interest. The soil sorption coefficient normalized to organic carbon content (Koc) is a key physicochemical parameter in environmental risk assessments of substances released into the environment. Accordingly, several logKoc prediction models that use the hydrophobic parameter (logP) or the logarithm of water solubility (logS) as a descriptor have been reported in recent decades. Because reliable experimental values of logP or logS are often lacking, algorithms are frequently used to calculate these properties. Although several such algorithms are readily available, published studies do not describe how the algorithm used in a quantitative structure-property relationship (QSPR) study was chosen. Furthermore, the strong correlation between logP and logS prevents their use in the same equation obtained by multiple linear regression. Since the sorption of a chemical compound in soil is related both to its water solubility and to its water/organic matter partitioning, models that combine these two properties are expected to give more realistic results.
This doctoral dissertation consists of two scientific papers. The first examines how the choice of logP algorithm influences logKoc modeling. Models relating logKoc to logP were built using different freely available algorithms, and all models were assessed on their statistical quality and predictive power. The results clearly showed that an arbitrary choice of algorithm may not yield the best prediction model, whereas a good choice can yield simple models whose statistical quality and predictive power are comparable to those of more complex models. The second paper proposes an alternative approach to logKoc modeling that uses a simple solubility descriptor, referred to here as the logarithm of solubility corrected by octanol/water partition (logSP). Models were built with this descriptor and with the conventional descriptors logP and logS, either alone or combined with other explanatory variables that have a straightforward physicochemical interpretation. The models were validated and compared with previously published models. The results showed that replacing the conventional descriptors with logSP yielded simple models whose statistical quality and predictive power exceed those of more complex models in the literature.
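The abstract's remark that logP and logS are too strongly correlated to appear in the same multiple linear regression can be illustrated with a small collinearity check. The sketch below is not taken from the dissertation: the descriptor values are invented for illustration, and the function names `pearson_r` and `vif_two_descriptors` are hypothetical. For two descriptors, the variance inflation factor reduces to 1 / (1 - r^2), where r is their Pearson correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_descriptors(x, y):
    """Variance inflation factor when exactly two descriptors are in the model."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Invented descriptor values (assumption): logS falls almost linearly as logP rises.
log_p = [1.0, 2.0, 3.0, 4.0, 5.0]
log_s = [-1.1, -1.9, -3.2, -3.9, -5.1]

r = pearson_r(log_p, log_s)
vif = vif_two_descriptors(log_p, log_s)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

A VIF far above the commonly quoted rule-of-thumb threshold of about 10 signals that the regression coefficients of the two descriptors cannot be estimated reliably together, which is consistent with the motivation the abstract gives for a single combined descriptor.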
29

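Modelos de predição do coeficiente de sorção no solo de pesticidas não iônicos: diferentes algoritmos de logP e uma abordagem alternativa de logS / Prediction models for the soil sorption coefficient of nonionic pesticides: different logP algorithms and an alternative logS approach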

Reis, Ralpho Rinaldo dos 17 May 2013 (has links)
30

Développement de modèles QSPR pour la prédiction et la compréhension des propriétés amphiphiles des tensioactifs dérivés de sucre / Development of QSPR models for the prediction and better understanding of amphiphilic properties of sugar-based surfactants

Gaudin, Théophile 30 November 2016 (has links)
Sugar-based surfactants are the main family of bio-based surfactants and are good candidates to replace petroleum-based surfactants, since they originate from renewable resources and can perform as well as, or even better than, petroleum-based surfactants in various applications, such as formulation (detergents, cosmetics), enhanced oil or mineral recovery, etc. Several amphiphilic properties characterize surfactant performance in such applications, including the critical micelle concentration, the surface tension at the critical micelle concentration, the efficiency, and the Krafft point. Predicting these properties would help identify surfactants with the desired properties more quickly. QSPR models can predict such properties, but no reliable QSPR model dedicated to them was identified for bio-based surfactants, and in particular for sugar-based surfactants. Such QSPR models were developed in this thesis.
A reliable database is required to develop any QSPR model. For sugar-based surfactants, no existing database was identified for the targeted properties. This motivated the construction of the first database of amphiphilic properties of sugar-based surfactants. Analysis of this database highlighted various empirical relationships between the chemical structure of these molecules and their amphiphilic properties, and made it possible to isolate the most reliable datasets, with the most homogeneous protocol possible, for developing the QSPR models.
After a robust strategy was established for calculating the molecular descriptors that make up the QSPR models, relying notably on conformational analysis of sugar-based surfactants and on descriptors calculated separately for the polar heads and the alkyl chains, different QSPR models were developed and validated, and their applicability domains defined, for the critical micelle concentration, the surface tension at the critical micelle concentration, the efficiency, and the Krafft point. For the first three properties, good quantitative models were obtained. While quantum chemical descriptors brought a significant gain in predictive power for the surface tension at the critical micelle concentration and a slight gain for the critical micelle concentration, no gain was observed for the efficiency. For these three properties, simple models based on constitutional descriptors of the hydrophilic and hydrophobic parts of the molecule (such as atom counts) were also obtained. For the Krafft point, two qualitative decision trees, which classify the molecule as water soluble or insoluble at room temperature, were proposed. Quantum chemical descriptors again increased predictive power, although a fairly reliable model based only on constitutional descriptors of the polar heads and alkyl chains was also obtained. Finally, we showed how these QSPR models can be used to predict the properties of new surfactants before synthesis in a computational screening context, to predict missing properties of existing surfactants, and to design new surfactants in silico by combining different polar heads with different alkyl chains.
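The fragment-combination screening mentioned at the end of the abstract can be sketched as an enumeration of head and chain pairs that are then ranked by a predictive model. Everything in the sketch below is an assumption made for illustration: the fragment libraries, the descriptor counts, and `placeholder_score` with its invented coefficients stand in for a fitted QSPR model and are not the thesis's models or data.

```python
from itertools import product

# Hypothetical fragment libraries (assumption): each fragment maps to one
# constitutional descriptor, a simple atom or group count.
POLAR_HEADS = {"glucoside": 4, "maltoside": 7}              # free hydroxyl groups
ALKYL_CHAINS = {"octyl": 8, "decyl": 10, "dodecyl": 12}     # chain carbon atoms


def placeholder_score(n_hydroxyl, n_carbon):
    """Stand-in for a fitted QSPR model; the coefficients are invented."""
    return 0.5 * n_hydroxyl - 0.3 * n_carbon


def enumerate_candidates():
    """Combine every polar head with every alkyl chain and rank by score."""
    candidates = []
    for (head, n_oh), (chain, n_c) in product(POLAR_HEADS.items(), ALKYL_CHAINS.items()):
        candidates.append((f"{chain} {head}", placeholder_score(n_oh, n_c)))
    # Lowest score first; which direction is "better" depends on the property.
    return sorted(candidates, key=lambda item: item[1])


ranked = enumerate_candidates()
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In a real screening workflow the placeholder would be replaced by a validated model, and candidates falling outside that model's applicability domain would be flagged rather than ranked.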
