Global ETD Search

1	Predicting multibody assembly of proteins Rasheed, Md. Muhibur 25 September 2014 This thesis addresses the multi-body assembly (MBA) problem in the context of protein assemblies. [...] In this thesis, we chose the protein assembly domain because accurate and reliable computational modeling, simulation and prediction of such assemblies would clearly accelerate discoveries in understanding the complexities of metabolic pathways, identifying the molecular basis for normal health and diseases, and designing new drugs and other therapeutics. [...] [We developed] F²Dock (Fast Fourier Docking), which uses a multi-term scoring function that combines a statistical thermodynamic approximation of molecular free energy with several knowledge-based terms. Parameters of the scoring model were learned from a large set of positive/negative examples, and when tested on 176 protein complexes of various types, the model showed excellent accuracy in ranking correct configurations higher (F²Dock ranks the correct solution as the top-ranked one in 22/176 cases, which is better than other unsupervised prediction software on the same benchmark). Most of the protein-protein interaction scoring terms can be expressed as integrals of distance-dependent decaying kernels over the occupied volume, the boundary, or a set of discrete points (atom locations). We developed a dynamic adaptive grid (DAG) data structure which computes smooth surface and volumetric representations of a protein complex in O(m log m) time, where m is the number of atoms, assuming that the smallest feature size h is Θ(r_max), where r_max is the radius of the largest atom; it supports updates in O(log m) time and uses O(m) memory. We also developed the dynamic packing grids (DPG) data structure, which supports quasi-constant time updates (O(log w)) and spherical neighborhood queries (O(log log w)), where w is the word size of the RAM. 
DPG and DAG together result in an O(k)-time approximation of scoring terms, where k << m is the size of the contact region between proteins. [...] [W]e consider the symmetric spherical shell assembly case, where multiple copies of identical proteins tile the surface of a sphere. Though this is a restricted subclass of MBA, it is an important one, since it would accelerate development of drugs and antibodies to prevent viruses from forming capsids, which have such spherical symmetry in nature. We proved that it is possible to characterize the space of possible symmetric spherical layouts using a small number of representative local arrangements (called tiles) and their global configurations (tilings). We further show that the tilings, and the mapping of proteins to tilings on arbitrary-sized shells, are parameterized by 3 discrete parameters and 6 continuous degrees of freedom (DOF), and that the 3 discrete DOF can be restricted to a constant number of cases if the size of the shell is known (in terms of the number of proteins n). We also consider the case where a coarse model of the whole complex of proteins is available. We show that even when such coarse models do not show atomic positions, they can be sufficient to identify a general location for each protein and its neighbors, and thereby restrict the configurational space. We developed an iterative refinement search protocol that leverages such multi-resolution structural data to predict accurate high-resolution models of protein complexes, and successfully applied the protocol to model gp120, a protein on the spike of HIV and currently the most feasible target for anti-HIV drug design. Keywords: Spatial data structures, Dynamic data structures, Geometric optimization, Fast Fourier methods, Computational geometry, Tiling, Polyhedra, molecular modeling, Molecular surface, Free energy, Uncertainty quantification
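The spherical neighborhood queries mentioned in this abstract can be illustrated with a much simpler structure. The following is a minimal sketch of a plain hashed uniform grid, not the thesis's DPG: it does not achieve the O(log w) update or O(log log w) query bounds, and the class name `HashGrid`, its methods, and the sample coordinates are all hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import floor


class HashGrid:
    """Hashed uniform grid over 3D points (a simplification, NOT the DPG)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # cell index -> ids of points in the cell
        self.points = {}               # point id -> (x, y, z)

    def _cell(self, p):
        # Integer cell index of a point along each axis.
        return tuple(floor(c / self.cell_size) for c in p)

    def insert(self, pid, p):
        self.points[pid] = p
        self.cells[self._cell(p)].add(pid)

    def remove(self, pid):
        p = self.points.pop(pid)
        self.cells[self._cell(p)].discard(pid)

    def neighbors(self, center, radius):
        """Return the sorted ids of all points within `radius` of `center`."""
        reach = int(radius // self.cell_size) + 1  # cells to scan per axis
        cx, cy, cz = self._cell(center)
        r2 = radius * radius
        found = []
        for dx, dy, dz in product(range(-reach, reach + 1), repeat=3):
            for pid in self.cells.get((cx + dx, cy + dy, cz + dz), ()):
                q = self.points[pid]
                if sum((a - b) ** 2 for a, b in zip(center, q)) <= r2:
                    found.append(pid)
        return sorted(found)


# Hypothetical atom centers, for illustration only.
grid = HashGrid(cell_size=2.0)
atoms = {0: (0.0, 0.0, 0.0), 1: (1.5, 0.0, 0.0),
         2: (5.0, 5.0, 5.0), 3: (-1.0, -1.0, 0.5)}
for pid, pos in atoms.items():
    grid.insert(pid, pos)
near_origin = grid.neighbors((0.0, 0.0, 0.0), 2.0)
```

Only cells overlapping the query ball's bounding box are scanned, so the cost depends on local density rather than on the total number of atoms, which is the property a contact-region scoring scheme relies on.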
2	Hierarchical Adaptive Quadrature and Quasi-Monte Carlo for Efficient Fourier Pricing of Multi-Asset Options Samet, Michael 11 July 2023 Efficiently pricing multi-asset options is a challenging problem in computational finance. Although classical Fourier methods are extremely fast in pricing single-asset options, maintaining the tractability of Fourier techniques for multi-asset option pricing is still an area of active research. Fourier methods rely on explicit knowledge of the characteristic function of a suitable stochastic price process, allowing the option price to be calculated by evaluating a multidimensional integral in the Fourier domain. The high smoothness of the integrand in Fourier space motivates the exploration of quadrature methods that are highly efficient under certain regularity assumptions, such as adaptive sparse grid quadrature (ASGQ) and randomized quasi-Monte Carlo (RQMC). However, when designing a numerical quadrature method for most of the existing Fourier pricing approaches, two key factors affecting the complexity should be carefully controlled: (i) the choice of the vector of damping parameters, which ensures Fourier integrability and controls the regularity class of the integrand, and (ii) the high dimensionality of the integration problem. To address these challenges, in the first part of this thesis we propose a rule for choosing the damping parameters, resulting in smoother integrands. Moreover, we explore the effect of sparsification and dimension-adaptivity in alleviating the curse of dimensionality. Despite the efficiency of ASGQ, its error estimates are very hard to compute. For cases where error quantification is a high priority, in the second part of this thesis we design an RQMC-based method for the (inverse) Fourier integral computation. 
RQMC integration is known to be highly efficient for high-dimensional integration problems with sufficiently regular integrands, and it further allows for the computation of probabilistic error estimates. Nonetheless, using RQMC requires an appropriate transformation of the unbounded integration domain to the hypercube, which may result in a transformed integrand with singularities at the boundaries and consequently deteriorate the rate of convergence. To preserve the desirable properties of the transformed integrand, we propose a model-dependent domain transformation that avoids these corner singularities and retains the optimal efficiency of RQMC. The effectiveness of the proposed optimal damping rule, the designed domain transformation procedure, and their combination with ASGQ and RQMC is demonstrated via several numerical experiments and computational comparisons to the MC approach and the COS method. Keywords: option pricing, Fourier methods, damping parameters, adaptive sparse grid quadrature, quasi-Monte Carlo, importance sampling, basket and rainbow options, multivariate Lévy models.
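The role of a damping parameter is easiest to see in the single-asset case. The following is a minimal sketch, assuming a Black-Scholes model and the classical damped Fourier representation of a European call (the Carr-Madan form) with a plain trapezoidal rule; it is not the multi-asset ASGQ or RQMC method of the thesis, and the function names, truncation bound `v_max`, and market parameters are illustrative assumptions.

```python
import cmath
import math


def bs_char_fn(u, s0, r, sigma, t):
    """Characteristic function of ln(S_T) under Black-Scholes; u may be complex."""
    mu = math.log(s0) + (r - 0.5 * sigma ** 2) * t
    return cmath.exp(1j * u * mu - 0.5 * sigma ** 2 * t * u * u)


def damped_fourier_call(s0, strike, r, sigma, t, alpha, v_max=100.0, n=4000):
    """European call from the damped Fourier integral via the trapezoidal rule.

    alpha > 0 is the damping parameter: the call price is multiplied by
    exp(alpha * k), with k = ln(strike), so that its Fourier transform exists.
    """
    k = math.log(strike)
    h = v_max / n

    def integrand(v):
        denom = alpha ** 2 + alpha - v ** 2 + 1j * (2 * alpha + 1) * v
        phi = bs_char_fn(v - (alpha + 1) * 1j, s0, r, sigma, t)
        psi = math.exp(-r * t) * phi / denom
        return (cmath.exp(-1j * v * k) * psi).real

    total = 0.5 * (integrand(0.0) + integrand(v_max))
    total += sum(integrand(j * h) for j in range(1, n))
    return math.exp(-alpha * k) / math.pi * h * total


def bs_call(s0, strike, r, sigma, t):
    """Closed-form Black-Scholes call price, used here only as a reference."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * t) * cdf(d2)


# Illustrative parameters: the price should not depend on the damping choice.
params = dict(s0=100.0, strike=100.0, r=0.05, sigma=0.2, t=1.0)
reference = bs_call(**params)
prices = {a: damped_fourier_call(alpha=a, **params) for a in (0.75, 1.5, 3.0)}
```

Every admissible `alpha` yields the same price in exact arithmetic, but the shape and peak of the integrand change with it, which is why a rule for choosing the damping parameters affects how hard the quadrature problem is.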
3	Interpretable Approximation of High-Dimensional Data based on the ANOVA Decomposition Schmischke, Michael 08 July 2022 The thesis is dedicated to the approximation of high-dimensional functions from scattered data nodes. Many methods in this area lack the property of interpretability in the context of explainable artificial intelligence. The idea is to address this shortcoming by proposing a new method that is intrinsically designed around interpretability. The multivariate analysis of variance (ANOVA) decomposition is the main tool to achieve this purpose. We study the connection between the ANOVA decomposition and orthonormal bases to obtain a powerful basis representation. Moreover, we focus on functions that are mostly explained by low-order interactions to circumvent the curse of dimensionality in its exponential form. Through the connection with grouped index sets, we propose a least-squares approximation via iterative LSQR. Here, the proposed grouped transformations provide fast algorithms for multiplication with the matrices that appear. Through global sensitivity indices, we are then able to analyze the approximation, which can be used to improve it further. The method is also well suited for the approximation of real data sets, where the sparsity-of-effects principle ensures a low-dimensional structure. We demonstrate the applicability of the method in multiple numerical experiments with real and synthetic data. Contents: 1 Introduction; 2 The Classical ANOVA Decomposition; 3 Fast Multiplication with Grouped Transformations; 4 High-Dimensional Explainable ANOVA Approximation; 5 Numerical Experiments with Synthetic Data; 6 Numerical Experiments with Real Data; 7 Conclusion; Bibliography
/ The thesis is devoted to the approximation of high-dimensional functions from scattered data points. Many methods in this area suffer from a lack of interpretability, which is especially important in the context of explainable artificial intelligence. To address this problem, we propose a new method that is designed around the concept of interpretability. Our most important tool for this is the analysis of variance (ANOVA) decomposition. In particular, we consider the connection between the ANOVA decomposition and orthonormal bases and obtain an important series representation. In addition, we focus on functions that are mainly explained by low-dimensional variable interactions. This helps us overcome the curse of dimensionality in its exponential form. Via the connection to grouped index sets, we then propose a least-squares approximation using the iterative LSQR algorithm. The proposed grouped transformations provide fast multiplication with the corresponding matrices. With the help of global sensitivity indices, we can analyze the approximation and improve it further. The method is also well suited to approximating real data sets, where the sparsity-of-effects principle ensures that we work with low-dimensional structures. We demonstrate the applicability of the method in various numerical experiments with real and synthetic data. info:eu-repo/classification/ddc/510 ddc:510 Keywords: approximation, multivariate analysis of variance, machine learning
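The ANOVA decomposition and the global sensitivity indices mentioned in this abstract can be made concrete in two dimensions. The following is a minimal sketch, assuming a function sampled on a tensor grid with uniform weights; it is not the grouped-transformation or LSQR method of the thesis, and the function names and the test function `f` are hypothetical.

```python
def anova_2d(f, grid1, grid2):
    """Discrete ANOVA terms of f on a tensor grid with uniform weights.

    Returns the samples, the constant term f0, the one-dimensional terms
    f1 and f2, and the interaction term f12, with
    f(x1, x2) = f0 + f1(x1) + f2(x2) + f12(x1, x2) at every grid point.
    """
    n1, n2 = len(grid1), len(grid2)
    vals = [[f(x1, x2) for x2 in grid2] for x1 in grid1]
    f0 = sum(map(sum, vals)) / (n1 * n2)
    f1 = [sum(row) / n2 - f0 for row in vals]
    f2 = [sum(vals[i][j] for i in range(n1)) / n1 - f0 for j in range(n2)]
    f12 = [[vals[i][j] - f0 - f1[i] - f2[j] for j in range(n2)]
           for i in range(n1)]
    return vals, f0, f1, f2, f12


def sensitivity_indices(vals, f0, f1, f2, f12):
    """Global sensitivity indices: variance of each term over total variance."""
    n1, n2 = len(f1), len(f2)
    total = sum((v - f0) ** 2 for row in vals for v in row) / (n1 * n2)
    v1 = sum(a * a for a in f1) / n1
    v2 = sum(a * a for a in f2) / n2
    v12 = sum(a * a for row in f12 for a in row) / (n1 * n2)
    return {"1": v1 / total, "2": v2 / total, "1,2": v12 / total}


# Hypothetical test function with a weak interaction between the variables.
f = lambda x1, x2: x1 + 2.0 * x2 + 0.5 * x1 * x2
grid = [(i + 0.5) / 8 for i in range(8)]  # midpoints of 8 cells in [0, 1]
terms = anova_2d(f, grid, grid)
indices = sensitivity_indices(*terms)
```

Because the terms are mutually orthogonal under the grid's product measure, the three indices sum to one, and a small value for `"1,2"` indicates that the function is mostly explained by low-order terms.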
