  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Search and Analysis of the Sequence Space of a Protein Using Computational Tools

Dubey, Anshul 25 August 2006
A new approach to the process of Directed Evolution is proposed, which utilizes different machine learning algorithms. Directed Evolution is a process of improving a protein for catalytic purposes by introducing random mutations in its sequence to create variants. Through these mutations, Directed Evolution explores the sequence space, which is defined as all the possible sequences for a given number of amino acids. Each variant sequence is assigned to one of two classes, positive or negative, according to its activity or stability. By employing machine learning algorithms for feature selection on the sequences of these variants, the attributes (amino acids) in the sequence that are important for the classification into positive or negative can be identified. Support Vector Machines (SVMs) were utilized to identify the important individual amino acids for any protein, which have to be preserved to maintain its activity. The results for the case of beta-lactamase show that such residues can be identified with high accuracy while using a small number of variant sequences. Another class of machine learning problems, Boolean Learning, was used to extend this approach to identifying interactions between the different amino acids in a protein's sequence using the variant sequences. It was shown through simulations that such interactions can be identified for any protein with a reasonable number of variant sequences. For experimental verification of this approach, two fluorescent proteins, mRFP and DsRed, were used to generate variants, which were screened for fluorescence. Using Boolean Learning, an interacting pair was identified, which was shown to be important for the fluorescence. It was also shown through experiments and simulations that knowing such pairs can increase the fraction of active variants in the library.
A Boolean Learning algorithm was also developed for this application, which can learn Boolean functions from data in the presence of classification noise.
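The idea of learning an interacting pair from labelled variants in the presence of classification noise can be illustrated with a minimal sketch. This is not the algorithm developed in the thesis; it is an exhaustive search, chosen here for clarity, over pairs of positions for the conjunction that agrees with the most labels. The encoding (1 = parental residue preserved, 0 = mutated), the function name `best_interacting_pair`, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations, product


def best_interacting_pair(variants, labels):
    """Return ((i, j), agreement) for the position pair whose conjunction
    (both positions preserved) matches the most activity labels.

    Hypothetical helper: picking the maximum agreement, rather than
    requiring perfect consistency, tolerates some mislabelled variants.
    """
    n_positions = len(variants[0])
    best_pair, best_agree = None, -1
    for i, j in combinations(range(n_positions), 2):
        agree = sum(
            bool(v[i] and v[j]) == bool(y) for v, y in zip(variants, labels)
        )
        if agree > best_agree:
            best_pair, best_agree = (i, j), agree
    return best_pair, best_agree / len(variants)


# Toy library (assumed data): every 4-position variant, active only when
# positions 1 and 3 are both preserved.
variants = list(product([0, 1], repeat=4))
labels = [bool(v[1] and v[3]) for v in variants]
labels[0] = not labels[0]  # one deliberately mislabelled variant (noise)

print(best_interacting_pair(variants, labels))  # ((1, 3), 0.9375)
```

The exhaustive search is quadratic in the number of positions, which is acceptable for a sketch but would need a smarter search for higher-order interactions.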
2

Optimization of Recombination Methods and Expanding the Utility of Penicillin G Acylase

Loo, Bernard Liat Wen 02 November 2007
Protein engineering can be performed by combinatorial techniques (directed evolution) and data-driven methods using machine-learning algorithms. The main characteristic of directed evolution (DE) is the application of an effective and efficient screen or selection on a diverse mutant library. As it is important to have a diverse mutant library for the success of DE, we compared the performance of DNA-shuffling and recombination PCR on fluorescent proteins using sequence information as well as statistical methods. We found that the diversity of the libraries generated by DNA-shuffling and recombination PCR depended on the type of skew primers used and was sensitive to nucleotide identity levels between genes. DNA-shuffling and recombination PCR produced libraries with different crossover tendencies, suggesting that the two protocols could be used in combination to produce better libraries. Data-driven protein engineering uses sequence, structure and function data along with analyzed empirical activity information to guide library design. Boolean Learning Support Vector Machines (BLSVM) were used to identify interacting residues in fluorescent proteins, and the gene templates were modified to preserve these interactions after recombination. By site-directed mutagenesis, recombination and expression experiments, we validated that BLSVM can be used to identify interacting residues and increase the fraction of active proteins in the library. As an extension to the above experiments, DE was applied to monomeric Red Fluorescent Proteins to improve their spectral characteristics, and structure-guided protein engineering was performed on penicillin G acylase (PGA), an industrially relevant catalyst, to change its substrate specificity.
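One simple way to compare the crossover tendencies of recombination libraries is to count, for each chimeric sequence, how often its parental origin switches. The sketch below is an assumption about how such a count could be made, not the statistical method used in the thesis: it presumes two pre-aligned parents of equal length, and the function name `count_crossovers` and the example sequences are hypothetical.

```python
def count_crossovers(parent_a, parent_b, chimera):
    """Count parental-origin switches along an aligned chimeric sequence.

    Only positions where the two parents differ are informative. Positions
    where the chimera matches neither parent (for example, point mutations)
    are skipped rather than assigned to a parent.
    """
    if not (len(parent_a) == len(parent_b) == len(chimera)):
        raise ValueError("sequences must be aligned to the same length")

    origins = []
    for a, b, c in zip(parent_a, parent_b, chimera):
        if a == b:
            continue  # identical in both parents: origin cannot be inferred
        if c == a:
            origins.append("A")
        elif c == b:
            origins.append("B")

    # A crossover is any change of origin between consecutive informative sites.
    return sum(1 for x, y in zip(origins, origins[1:]) if x != y)


# Made-up sequences: the chimera switches A -> B -> A, so two crossovers.
print(count_crossovers("AAAAAAAA", "CCCCCCCC", "AACCCAAA"))  # 2
```

Because uninformative positions are skipped, crossovers that fall inside stretches of identical sequence are placed between the nearest informative sites, so the count is a lower bound when parents share long identical regions.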
