Global ETD Search

11	Dynamics Based Damage Detection of Plate-Type Structures Lu, Kan January 2005 (has links) No description available. Dynamic response Delamination Plates Scanning Laser Vibrometer Polyvinylidenefluoride sensors damage detection algorithms Uniform load surface
12	Stair-specific algorithms for identification of touch-down and foot-off when descending or ascending a non-instrumented staircase. Foster, Richard J., De Asha, Alan R., Reeves, N.D., Maganaris, C.N., Buckley, John 05 November 2013 (has links) yes / The present study introduces four event detection algorithms for defining touch-down and foot-off during stair descent and stair ascent using segmental kinematics. For stair descent, vertical velocity minima of the whole body center-of-mass was used to define touch-down, and foot-off was defined as the instant of trail limb peak knee flexion. For stair ascent, vertical velocity local minima of the lead-limb toe was used to define touch-down, and foot-off was defined as the local maxima in vertical displacement between the toe and pelvis. The performance of these algorithms was determined as the agreement in timings of kinematically derived events to those defined kinetically (ground reaction forces). Data were recorded while 17 young and 15 older adults completed stair descent and ascent trials over a four-step instrumented staircase. Trials were repeated for three stair riser height conditions (85 mm, 170 mm, and 255 mm). Kinematically derived touch-down and foot-off events showed good agreement (small 95% limits of agreement) with kinetically derived events for both young and older adults, across all riser heights, and for both ascent and descent. In addition, agreement metrics were better than those returned using existing kinematically derived event detection algorithms developed for overground gait. These results indicate that touch-down and foot-off during stair ascent and descent of non-instrumented staircases can be determined with acceptable precision using segmental kinematic data.
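The touch-down rule described in the abstract above (a local minimum in a vertical velocity trace) can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the central-difference velocity estimate, and the strict-inequality definition of a local minimum are all assumptions, and real marker data would normally be filtered first.

```python
def local_minima(signal):
    """Return indices whose value is strictly lower than both neighbours."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]


def vertical_velocity(position, dt):
    """Estimate velocity by central differences (one-sided at the ends)."""
    n = len(position)
    velocity = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        velocity.append((position[hi] - position[lo]) / ((hi - lo) * dt))
    return velocity


def touch_down_frames(vertical_position, dt):
    """Candidate touch-down frames: local minima of vertical velocity.

    `vertical_position` is a hypothetical list of vertical coordinates
    (e.g. of the center of mass for descent, or the lead-limb toe for ascent).
    """
    return local_minima(vertical_velocity(vertical_position, dt))
```

For example, `touch_down_frames([10, 9, 6, 4, 3, 3], 1.0)` flags frame 2, where the downward velocity is greatest before the trace levels off.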
13	Event Detection Algorithm for Single Sensor Bladder Pressure Data Karam, Robert 27 January 2016 (has links) No description available. Computer Engineering event detection algorithms signal processing incontinence spinal cord injury urinary dysfunction
14	Applications of Fourier Analysis to Audio Signal Processing: An Investigation of Chord Detection Algorithms Lenssen, Nathan 01 January 2013 (has links) The discrete Fourier transform has become an essential tool in the analysis of digital signals. Applications have become widespread since the discovery of the Fast Fourier Transform and the rise of personal computers. The field of digital signal processing is an exciting intersection of mathematics, statistics, and electrical engineering. In this study we aim to gain understanding of the mathematics behind algorithms that can extract chord information from recorded music. We investigate basic music theory, introduce and derive the discrete Fourier transform, and apply Fourier analysis to audio files to extract spectral data. Fourier Analysis Audio Signal Processing Chord Detection Algorithms Fast Fourier Transform digital signal processing music theory Other Applied Mathematics
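As a rough illustration of the pipeline this abstract outlines (discrete Fourier transform, then spectral peaks, then note names), the sketch below computes a naive DFT and maps the strongest frequency to a pitch class. It is not taken from the thesis: the O(n^2) DFT, the function names, the single-peak simplification, and the A4 = 440 Hz equal-temperament tuning are assumptions made for the example. A chord detector would need to consider several peaks at once.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dft(samples):
    """Naive discrete Fourier transform: X[k] = sum_t x[t] * exp(-2*pi*i*k*t/n)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the largest-magnitude bin below the Nyquist limit."""
    spectrum = dft(samples)
    half = len(samples) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(samples)


def pitch_class(freq_hz, a4=440.0):
    """Nearest equal-tempered pitch class, assuming A4 = 440 Hz."""
    semitones_from_a4 = round(12 * math.log2(freq_hz / a4))
    return NOTE_NAMES[(semitones_from_a4 + 9) % 12]  # A sits 9 semitones above C
```

Applying `pitch_class` to each of the strongest peaks of a recording, rather than only one, gives the set of pitch classes from which a chord label could be guessed.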
15	On the Detection of Retinal Vessels in Fundus Images Fang, Bin, Hsu, Wynne, Lee, Mong Li 01 1900 (has links) Ocular fundus image can provide information on pathological changes caused by local ocular diseases and early signs of certain systemic diseases. Automated analysis and interpretation of fundus images has become a necessary and important diagnostic procedure in ophthalmology. Among the features in ocular fundus image are the optic disc, fovea (central vision area), lesions, and retinal vessels. These features are useful in revealing the states of diseases in the form of measurable abnormalities such as length of diameter, change in color, and degree of tortuosity in the vessels. In addition, retinal vessels can also serve as landmarks for image-guided laser treatment of choroidal neovascularization. Thus, reliable methods for blood vessel detection that preserve various vessel measurements are needed. In this paper, we will examine the pathological issues in the analysis of retinal vessels in digital fundus images and give a survey of current image processing methods for extracting vessels in retinal images with a view to categorize them and highlight their differences and similarities. We have also implemented two major approaches using matched filter and mathematical morphology respectively and compared their performances. Some prospective research directions are identified. / Singapore-MIT Alliance (SMA) mathematical morphology matched filter retinal images blood vessel detection ocular fundus images segmentation tracking method edge detection algorithms
16	A Radon Space Approach To Multiresolution Tomographic Reconstruction And Multiscale Edge Detection Using Wavelets Goel, Anurag 11 1900 (has links) (PDF) No description available. Tomography Geometric Tomography Radon Transforms Image Reconstruction Wavelets Multiscale Edge Detection Algorithms Radon Space Approach Inverse Radon Transform Biomedical Engineering
17	Inverse Sensitivity Methods In Linear Structural Damage Detection Using Vibration Data Venkatesha, S 03 1900 (has links) The thesis addresses the problem of structural damage detection using inverse sensitivity based methods. The focus here is on characterization with regard to identification, location, and, quantification of structural damage in linear time invariant (LTI) systems, using vibration data. The study encompasses both analytical and experimental methods. A suite of five algorithms for damage detection, namely, inverse eigensensitivity method that is refined to account for cross orthogonality between distinct modes, damping dependent eigensolutions, and sensitivity with respect to points of antiresonance and minima, inverse FRF method that includes refinements in terms of inclusion of second order sensitivity, response function method (RFM) based on first order Taylor’s expansion, a newly proposed inverse sensitivity method based on singular values of FRF matrix, and method based on response time histories, are presented. The scope of these methods vis-à-vis the need for model reduction, ability to deal with incomplete data, ill-posedness of governing equations and the need for regularization, sensitivity with respect to measurement noise, ability to identify damping characteristics, the highest and lowest magnitudes of changes in structural properties, and the ability to characterize systems with closely spaced natural frequencies that the methods can detect are discussed. The performance of proposed procedures is illustrated by considering a five degrees-of-freedom (dof) mass-spring-dashpot system and subsequently applied on three archetypal structural systems using analytical and experimental methods. 
In the examples presented, factors, such as, completeness of measured data in time and frequency, nature (proportional/non-proportional) and magnitude of damping, levels of changes in structural properties, modal truncations, number of governing equations for system parameters, and efficacy of regularization techniques are investigated. The study also highlights the difficulties in implementing the damage detection algorithm based on real life noisy vibration data. A comparative study on the suitability of each of these methods in locating and quantifying of different damage scenarios has been reported. A critical review of performance of the various methods is presented. The thesis concludes with a summary on the contributions made and also deliberates on future avenues for research and development in this area of research. Structural Analysis Vibration Analysis Inverse Sensitivity Structural Damage Detection - Algorithms Structural Dynamics Vibration Data Inverse Eigensensitivity Linear Time Invariant (LTI) Structural Engineering
18	Développement et caractérisation d'un système de sol piézoélectrique intelligent : application à la détection des chutes. / Development and characterization of a smart flooring system : application to fall detection Serra, Renan 30 June 2017 (has links) This thesis belongs to the field of the design and construction of smart systems combined with a floor sensor technology. Its main objective is to design an automated, smart tool that detects falls of elderly people in hospitals or nursing homes, in order to provide additional information to healthcare workers. First, the various sensor technologies applied to floor coverings were studied. Among the technologies identified, planar piezoelectric polymer sensors were chosen for the development of the smart system. 
Then, characterization of the selected technical solution made it possible to define the conditions and limits of use of the sensor. Robustness and durability were evaluated using methods developed specifically for that purpose. Finally, detection algorithms were developed to detect falls, footsteps and the presence of people on the sensors. Classification strategies based on Pearson’s correlation, machine learning algorithms or threshold-based algorithms were used. Sensors Smart systems Characterization Signal processing Detection algorithms, Footsteps, fall and presence detection 621.36
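One way to picture the Pearson-correlation classification mentioned above is template matching: a recorded sensor window is compared with reference waveforms and assigned the label of the best match. The sketch below is a hedged illustration under assumed names and data. The templates, the 0.8 acceptance threshold, and the `classify` function are hypothetical and do not come from the thesis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant signal has no defined correlation; treat as no match
    return cov / (norm_x * norm_y)


def classify(signal, templates, min_corr=0.8):
    """Return the label of the best-correlated template, or "unknown".

    `templates` maps a label to a reference waveform; `min_corr` is an
    assumed acceptance threshold.
    """
    best_label, best_r = "unknown", min_corr
    for label, template in templates.items():
        r = pearson(signal, template)
        if r >= best_r:
            best_label, best_r = label, r
    return best_label
```

Because the correlation coefficient ignores amplitude and offset, a scaled copy of the fall template still matches it, which is one reason such a measure suits sensors whose gain varies.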
19	Development And Applications Of Computational Methods To Aid Recognition Of Protein Functions And Interactions Krishnadev, O 03 1900 (has links) (PDF) Protein homology detection has played a central role in the understanding of evolution of protein structures, functions and interactions. Many of the developments in protein bioinformatics can be traced back to an initial step of homology detection. It is not surprising then, that extension of remote homology detection has gained a lot of attention in the recent past. The explosive growth of genome sequences and the slow pace of experimental techniques have thrust computational analyses into the limelight. It is not surprising to see that many of the traditional experimental areas such as gene expression analysis, recognition of function and recognition of 3-D structure have been attempted effectively by computational approaches. The idea behind homology-based bioinformatics work is the fact that the hereditary mechanisms ensure that the parent generation gives rise to a very similar offspring generation. Since biological functions of proteins of an organism are product of expression of its genetic material, it follows that the genes of an organism should show conservation from one generation to another (with very few mutations if parent and offspring generation have to be nearly identical) Thus, if it can be established that two proteins have descended from a common ancestor, then it can be inferred that the biological functions of the two proteins could be very similar. Thus, homology-based information transfer from one protein to another has become a commonly used procedure in protein bioinformatics. The ability to recognize homologs of a protein solely from amino acid sequences has seen a steady increase in the last two decades. However, currently, still there are a large number of proteins of known amino acid sequence and yet unknown function . 
Thus, a major goal of current computational work is to extend the limits of remote homology detection to enable the functional characterization of proteins of unknown function. Since proteins do not work in isolation in a cell, it has become essential to understand the in vivo context of the function of a protein. For this purpose, it is essential to have an understanding of all the molecules that interact with a particular protein. Thus, another major area of bioinformatics has been to integrate biological information with protein-protein interactions to enable a better understanding of the molecular processes. Such attempts have been made successfully for the interaction network of proteins within an organism. The extension of the interaction network analysis to a host-pathogen scenario can lead to useful insights into pathophysiology of diseases. The work done as part of the thesis explores both the ideas mentioned above, namely, the extension of limits of remote homology detection and prediction of protein-protein interactions between a pathogen and its host. Since the work can logically be divided into two different areas though there is a connection, the thesis is organized as two parts. The first part of the thesis (comprising Chapters 2, 3, 4 and 5) describes the development and application of remote homology detection tools for function/structure annotation. The second part of the thesis (comprising of Chapters 6, 7, 8 and 9) describes the development and application of a homology-based procedure for detection of host-pathogen protein-protein interactions. Chapter 1 provides a background and literature survey in the areas of homology detection and prediction of protein-protein interactions. It is argued that homology-based information transfer is currently an important tool in the prediction and recognition of protein structures, functions and interactions. 
The development of remote homology detection methods and its effect on function recognition has been highlighted. Recent work in the area of prediction of protein-protein interactions using homology to known interaction templates is described and it is implied to be a successful approach for prediction of protein-protein interactions on a genome scale. The importance of further improvements in remote homology detection (as done in the first part of the thesis), is emphasized for annotation of proteins in newly sequenced genomes. The importance of application of homology detection methods in predicting protein-protein interactions across host-pathogen organisms is also explored. Chapter 2 analyzes the performance of the PSI-BLAST, one of the well-known and very effective approaches for recognition of related proteins, for remote homology detection. The chapter describes in detail the working of the PSI-BLAST algorithm and focuses on three parameters that determine the time required for searching in a large database, and also provide a ceiling for the sensitivity of the search procedure. The parameters that have been analyzed are the window size for two-hit method, the threshold for extension of an initial hit to dynamic programming and the extent of dependence on the query as encompassed in the profile generation step. The procedure followed for the analysis is to consider a large database of known evolutionary relationships (SCOP database was chosen for the analysis), and use the PSI-BLAST program at different values of three parameters to find out the effect on sensitivity (defined as the normalized number of correct SCOP superfamily relationships found in a search), and the time required for completion of the search. For the demonstration of the effect on the query dependence, a multiple sequence alignment (MSA) of a SCOP family (generated from all family sequences using ClustalW), was used with multiple queries to derive profiles in PSI-BLAST runs. 
The increase in sensitivity and the increase in time required for completion of each search were then monitored. Changing the two PSI-BLAST internal parameters, namely the score threshold for extension of word hits and the window size for the two-hit method, does not result in a significant increase in sensitivity. Since PSI-BLAST uses the amino acid residues present in the query sequence to derive the Position Specific Scoring Matrix (PSSM) parameters, the sensitivity of each PSSM depends strongly on the query. Using multiple PSSMs derived from a single MSA can thus help overcome the query dependence and increase the sensitivity. In this chapter such an approach, named MulPSSM, has been demonstrated to have higher sensitivity than the single-profile approach (by up to a factor of two) in a benchmark dataset of 100 randomly chosen SCOP folds. Strategies to optimize sensitivity and the time required in searching MulPSSM have been explored, and it is found that use of a non-redundant set of queries to generate MulPSSM can reduce the time required for each search while not affecting the sensitivity by a large degree. The application of the MulPSSM approach in function annotation of proteins in completely sequenced genomes was explored by searching genomic sequences in a MulPSSM database of Pfam families. The association of function to proteins has been assessed using both a single-profile-per-family database and a MulPSSM database of families. It is found that, across a comprehensive list of 291 genomes of Prokaryotes, 44 genomes of Eukaryotes and 40 genomes of Archaea, MulPSSM is on average able to identify evolutionary relationships for 10% more proteins in a genome than the single-profile approach. Such an enhancement in the recognition of evolutionary relationships, which has an implication in obtaining clues to functions, can help in more efficient exploration of newly sequenced genomes. 
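The multiple-profile idea can be illustrated with a toy model: score a sequence against several PSSMs built for the same family and keep the best score, then measure sensitivity as the fraction of known relationships recovered. This is a deliberately simplified sketch and not the MulPSSM implementation. PSI-BLAST and RPS-BLAST use gapped local alignment, whereas this example assumes ungapped, equal-length scoring, and all names are hypothetical.

```python
def score_against_pssm(sequence, pssm):
    """Ungapped score of a sequence against a PSSM of the same length.

    `pssm` is a list of dicts, one per position, mapping residue -> score;
    residues absent from a column score 0.0 (an assumption of this sketch).
    """
    return sum(column.get(residue, 0.0)
               for residue, column in zip(sequence, pssm))


def best_profile_score(sequence, pssms):
    """Score against every profile of a family and keep the highest score."""
    return max(score_against_pssm(sequence, pssm) for pssm in pssms)


def sensitivity(found, true_relationships):
    """Fraction of the known relationships that the search recovered."""
    true_set = set(true_relationships)
    return len(set(found) & true_set) / len(true_set)
```

Taking the maximum over profiles means a sequence only has to resemble one of the queries used to build the family's profiles, which is the intuition behind reducing query dependence.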
Identification of evolutionary relationships involving some of the proteins of M. tuberculosis and M. leprae has been possible due to the use of the multiple-profile search approach discussed in this chapter. The examples of annotations provided in the chapter include enzymes involved in glycolipid synthesis, which is vital for the survival of the pathogens inside the host, and such annotations can help in expanding our knowledge of these processes. Chapter 3 describes the development and assessment of a sensitive remote homology detection method. The sensitivity of remote homology detection methods has been steadily increasing in the past decade and profile analysis has become a mainstay of such efforts. The profile is a probabilistic model of substitutions allowed at each position in a sequence family, and hence captures the essential features of a family. Alignment of two such profiles is thus considered to provide a more sensitive and accurate method than the alignment of two sequences. The performance of HMMs (Hidden Markov Models) has been shown to be higher than that of PSSMs (Position Specific Scoring Matrices). Thus, a profile-profile alignment using HMMs can in principle give the best possible sensitivity in remote homology detection. Many investigators have incorporated residue conservation and secondary structure information to align two HMMs, and such additional information has been demonstrated to provide better sensitivity in remote homology detection (for instance in the HHSearch program). The work presented in Chapter 3 extends this idea by incorporating explicit hydrophobicity information, along with conservation and predicted secondary structure, over a window of Multiple Sequence Alignment (MSA) columns when aligning HMMs. The new algorithm is named AlignHUSH (Alignment of HMMs Using Secondary structure and Hydrophobicity). 
The HMMs used in the work are derived from structural alignments using HMMER program and are taken from the publicly available superfamily database which provides HMMs for all the SCOP families. The HMMs are modified into two-state HMMs by collapsing the ‘insert’ and ‘delete’ states into a ‘non-match’ state in the AlignHUSH algorithm. The two state HMMs enables the use of dynamic programming methods and keeps intact the position-specific gap penalties. The two state HMMs can be more readily extended to alignment of PSSMs. The incorporation of secondary structure information is made using secondary structure predictions made using PSIPRED program. The hydrophobicity information is calculated using the Kyte Doolittle hydrophobicity values. The alignment is generated by scoring each position using the values present in a window of residues. The assessment of alignment accuracy is done by comparison to manually curated alignments present in the BaliBASE database. A detailed description of the optimization steps followed for obtaining the values for each score contribution (conservation, secondary structure and hydrophobicity) is provided. The assessment revealed that a high weightage to conservation score (18.0) and low weightage to the secondary structure score (1.5) and hydrophobicity (1.0) is optimal. The use of residue windows in alignment has been shown to dramatically increase the sensitivity (around 30% on a small dataset comprising 10% of total SCOP domains). The sensitivity of AlignHUSH algorithm in comparison to other HMM-HMM alignment methods HHSearch and PRC in an all-against-all comparison of SCOP 1.69 database demonstrates that AlignHUSH has better sensitivity than both HHSearch and PRC (approximately by 10% and 5% respectively). 
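The weighting and windowing described above can be sketched numerically. The weights 18.0, 1.5 and 1.0 are the values reported in the abstract. Everything else is an assumption made for illustration: the per-position similarity inputs, the simple averaging over the window, the truncation of windows at sequence ends, and the function names. The actual AlignHUSH scoring function is not given in the abstract.

```python
# Weights reported in the abstract for the three score contributions.
W_CONSERVATION = 18.0
W_SECONDARY_STRUCTURE = 1.5
W_HYDROPHOBICITY = 1.0


def combined_scores(conservation, secondary_structure, hydrophobicity):
    """Weighted sum of three per-position similarity terms (hypothetical inputs)."""
    return [W_CONSERVATION * c + W_SECONDARY_STRUCTURE * s + W_HYDROPHOBICITY * h
            for c, s, h in zip(conservation, secondary_structure, hydrophobicity)]


def windowed_score(scores, center, half_window):
    """Average the combined score over a window centred on `center`.

    Windows are truncated at the ends of the sequence; this boundary
    handling is an assumption of the sketch.
    """
    start = max(center - half_window, 0)
    end = min(center + half_window, len(scores) - 1)
    window = scores[start:end + 1]
    return sum(window) / len(window)
```

With these weights a unit change in the conservation term outweighs the same change in the secondary structure term by a factor of 12, so the extra terms act mainly as tie-breakers between similarly conserved positions.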
The alignment accuracy calculated as the ratio of correctly aligned residues and all alignment positions in BaliBASE alignments reveals that AlignHUSH algorithm provides an accuracy comparable or marginally higher than both HHSearch and PRC (25% for AlignHUSH and roughly 17% for both HHSearch and PRC). A few examples of structural relationships between SCOP families belonging to different folds and/or classes are presented in the chapter to illustrate the strength of AlignHUSH in detecting very remote relationships. Chapter 4 describes a database of evolutionary relationships identified between Pfam families. The grouping of Pfam families is important for obtaining better understanding on evolutionary relationships and in obtaining clues to functions of proteins in families of yet unknown function. Much effort has been taken by various investigators in bringing many proteins in the sequence databases within homology modeling distance with a protein of known structure. Structural genomics initiatives spend considerable effort in achieving this goal. The results from such experiments suggest that in many cases after the structure has been solved using X-ray crystallography or NMR methods, the protein is seen to have structural similarity to a protein of already known structure. Thus, an inability to detect such remote relationships severely impairs the efficiency of structural genomics initiatives. The development of the SUPFAM method was made earlier in the group to enable detection of distant relationships between Pfam families. In SUPFAM approach, relationships are detected by mapping the Pfam families to SCOP families. Further, using the implicit or explicit evolutionary relationship information present in the SCOP database relationships between Pfam families are detected. The work presented in this chapter is an improvement of previous development using the significantly more sensitive AlignHUSH method to uncover more relationships. 
The new database follows a procedure slightly different than the older SUPFAM database and hence is called SUPFAM+. The relative improvement brought by SUPFAM+ has been discussed in detail in the chapter. The methodology followed for the analysis is to first generate SUPFAM database by recognition of relationships between Pfam families and SCOP families using PSI BLAST / RPS BLAST. For the generation of SUPFAM+ database, recognition of relationships between Pfam families and SCOP families is done using AlignHUSH. The criteria are kept stringent at this stage to minimize the rate of false positives. In cases of a Pfam family mapping to two or more SCOP superfamilies, a semi-automated decision tree is used to assign the Pfam family to a single SCOP superfamily. Some of the Pfam families which remain without a mapping to a SCOP family are mapped indirectly to a SCOP family by identifying relationships between such Pfam families and other Pfam families which are already mapped to a SCOP family. In the final step, the Pfam families still without a SCOP family mapping are mapped onto one another to form ‘Potential New Superfamilies’ (PNSF), which are excellent targets for structural genomics since none of the proteins in such PNSFs have a recognizable homologue of known structure. The clustering of Pfam families into Superfamilies belonging to SCOP 1.69 version, were then queried to check if a structure has been solved for these Pfam families subsequent to the release of the SCOP 1.69 database. The latest SCOP database reveals that for close to 87 Pfam families a structure was solved which is at best related at a SCOP superfamily level with a family present in SCOP 1.69. An analysis of the mappings provided by SUPFAM+ database reveals that the mappings are correct in 85% of the cases at the SCOP superfamily level. An in-depth analysis revealed that among the rest of the cases, only one can be adjudged as an incorrect mapping. 
Many of the inconsistent mappings were found to be due to the absence of the SCOP fold in the SCOP 1.69 release, although interestingly the mapping provided by the SUPFAM+ database shows structural similarity to the actual fold found subsequently for the Pfam family. A straightforward comparison with a similar database (the Pfam Clans database) reveals that the SUPFAM+ database could suggest four times more pairwise relationships between Pfam families than the Pfam Clans database. Thus, since the structural mappings provided in the SUPFAM+ database are very accurate, the relationships found in the database could help in function annotation of uncharacterized protein families (explored in Chapter 5). The accuracy of mapping would be similar for the PNSFs, and hence these clusters can be excellent targets for structural genomics initiatives. The classification of families based on sequence/structural similarities can also be useful for function annotation of families of uncharacterized proteins, and such an idea is explored in the next chapter. Chapter 5 describes the attempts made to obtain clues to the structure and/or function of the DUF (Domain of Unknown Function) families present in the Pfam database. Currently, the DUF families make up around 21% of the Pfam database (2260 out of 10340). Thus, although homologues for each of the proteins in these families can be recognized in sequence databases, the homology does not provide obvious insight into the function of these proteins. The annotation of such difficult targets is a major goal of computational biologists in the post-genomic era. The development of a sensitive profile-profile alignment method as part of this thesis gives an excellent opportunity to increase the number of annotations for proteins, especially in the DUF families, since a profile for these families exists in the Pfam database. 
The method followed for the analysis is similar to the SUPFAM+ development, and involved generation of Pfam profiles compatible with the AlignHUSH method. For the analysis presented in the chapter, relationships found between DUF families and SCOP families were analyzed. In benchmarks using the AlignHUSH method, it was found that a Z score of 5.0 gives a 10% error rate, and a Z score of 7.5 gives an error rate of 1%, and hence a minimum Z score cutoff of 7.5 was used in the analysis. A very high Z score in AlignHUSH is usually seen when sequence identity is also high, so a maximum Z score cutoff of 12.0 was used to find DUF families which are difficult to annotate using other profile-based methods (such as PSI-BLAST). For some of the DUF families, subsequent structure determination of one of the proteins had been reported in literature, and these cases were used to assess the accuracy of structural annotation using AlignHUSH. In other cases, fold recognition was done using the PHYRE method to ensure that the structure mappings are corroborated by fold recognition. In all cases studied, the alignment of the DUF family with the SCOP family was generated and queried for conservation of active site residues reported for each homologous SCOP family in the CSA (Catalytic Site Atlas) database. The assessment on 8 DUF families for which a structure was solved subsequent to the SCOP release used in the analysis reveals that in all cases the correct structure was identified using the AlignHUSH procedure. In the eight cases of validated structure annotation, conservation of active site residues was seen, pointing to the effectiveness of AlignHUSH and its use in function annotation. In the 27 cases in which no structure is known for any of the proteins in the DUF family, the results from fold recognition corroborate the suggestion made by AlignHUSH in every case. 
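The Z score window used above (at least 7.5 for a roughly 1% error rate, at most 12.0 to exclude easy, high-identity cases) amounts to a simple band filter over candidate relationships. The sketch below assumes hits are stored as `(duf_family, scop_family, z_score)` tuples and that both bounds are inclusive. Neither detail is stated in the abstract, and the identifiers in the example are made up.

```python
def select_hits(hits, z_min=7.5, z_max=12.0):
    """Keep relationships whose Z score lies inside the band [z_min, z_max].

    `hits` is an iterable of (duf_family, scop_family, z_score) tuples.
    Inclusive bounds are an assumption of this sketch.
    """
    return [(duf, scop, z) for duf, scop, z in hits if z_min <= z <= z_max]


# Hypothetical identifiers, for illustration only.
example_hits = [
    ("DUF_a", "scop_1", 5.2),   # below the 1%-error cutoff: rejected
    ("DUF_b", "scop_2", 8.9),   # inside the band: kept
    ("DUF_c", "scop_3", 14.1),  # likely detectable by simpler methods: rejected
]
selected = select_hits(example_hits)
```

Here `selected` contains only the `DUF_b` entry.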
The alignments of each of the DUF families with the suggested homologous SCOP family reveals that in many cases the active site residues are not conserved or are substituted by different residues. An in-depth analysis of some cases reveals that the non-conservation of residues occurs between two SCOP families in the same SCOP superfamily. Thus, although structure annotation can be reliably provided for all the DUF families studied, the exact biochemical function could be detected only for those cases in which active site conservation is seen even among distantly related families (such as two SCOP families in the same SCOP superfamily). The development and application of methods for remote homology detection has been made successfully and it has been demonstrated in the first part of the thesis that there is scope for extending the limits of remote homology detection. The use of sequence derived information in aligning profiles makes the procedure generally applicable and has been applied successfully for the case of structure/function recognition in the DUF families. In the next part of the thesis, a method for prediction of protein-protein interactions between a host and pathogen organism and its application to three groups of pathogens is presented. Chapter 6 describes the development of a procedure for prediction of protein-protein interactions (PPI) between a pathogen and its host organism. In the past, prediction of PPI has been attempted for proteins of a given organism. This was often approached by identifying proteins of the organism of interest that are homologous to two interacting proteins of another organism. A study of conservation of interactions as a function of sequence identity has been made in the past by various groups, which reveal that homologues sharing a sequence identity greater than about 30% interact in similar way. 
This fact can be used, along with a high quality database of protein-protein interactions to predict interactions between proteins of same organism. The work done in this thesis is one of the first attempts at extending the idea to the prediction of interactions between two different organisms. Homology of proteins from a pathogen and its host to proteins which are known to interact with each other would suggest that the proteins from pathogen and host can interact. The feasibility of such an interaction to occur under in vivo conditions need to be addressed for biologically meaningful predictions. These issues have been dealt with in this part of the thesis. One of the main steps in the procedure for the prediction of PPI is identification of homologues of pathogen and host proteins to interacting proteins listed in PPI databases. Two template PPI databases have been used in this work. One of the databases is the DIP database which provides a list of interactions based on genome-scale yeast-two-hybrid data or small scale experiments. The other database used is the iPfam database which provides interaction templates (Pfam families) based on protein complexes of known structure present in Protein Data Bank (PDB). Thus, the two databases are both comprehensive and are of high quality. The search for homologues in the DIP database was made using PSI-BLAST with stringent cutoffs for various parameters to minimize false positives. The search in iPfam database is done using RPS-BLAST and MulPSSM using stringent cutoffs. The cutoffs for the searches were fixed based on an assessment of conservation of putative interacting residues in the host and pathogen proteins as compared to the protein complexes of known structure. The predictions made are analyzed manually to assess the importance to the pathogenesis of the disease under consideration. 
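The homology-based transfer of interactions described above can be sketched as a lookup: a pathogen protein and a host protein are predicted to interact when each is homologous to one partner of a known interacting template pair. The code below is a minimal sketch under assumed data structures. The dictionaries of precomputed homologs, the template list, and all identifiers are hypothetical, and the homology search itself (PSI-BLAST, RPS-BLAST or MulPSSM with stringent cutoffs) is taken as already done.

```python
def predict_interactions(pathogen_homologs, host_homologs, template_pairs):
    """Predict host-pathogen pairs from homology to interacting templates.

    pathogen_homologs / host_homologs: dict mapping a protein id to the set
    of template ids it is homologous to (assumed to be precomputed).
    template_pairs: iterable of (template_a, template_b) known interactions,
    treated as undirected.
    """
    templates = {frozenset(pair) for pair in template_pairs}
    predictions = set()
    for pathogen_protein, p_templates in pathogen_homologs.items():
        for host_protein, h_templates in host_homologs.items():
            if any(frozenset((p, h)) in templates
                   for p in p_templates for h in h_templates):
                predictions.add((pathogen_protein, host_protein))
    return sorted(predictions)
```

The result is only a list of candidates. As the abstract notes, whether a predicted pair can actually meet in vivo has to be checked separately.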
In this chapter, to assess the robustness of this approach, PPI predictions were made for the phage-bacteria system and the herpes virus-human system, which have been extensively studied experimentally and hence offer the opportunity to compare the “predictions” with experimental results. The prediction of phage-bacteria interactions suggests that the gross biological features of the pathogenesis have been captured in the predictions. The GO (Gene Ontology) based annotations for the bacterial proteins predicted to interact suggest that the predictions involve proteins participating in DNA replication and protein synthesis. Many of the known interactions, such as that between the lambda phage repressor and the RecA protein of bacteria, were also ‘predicted’ in the analysis. A few novel interactions were predicted. For example, an interaction between a tail component protein and a protein of unknown function, YeeJ, in E. coli has been predicted. The prediction of interactions between Herpes Virus 8 and its human host, and its comparison to a set of experimentally verified interactions reported in the literature, suggested that close to 50% of the known interactions were ‘predicted’ by the procedure followed. A few novel interactions between the viral proteins and the p53 protein have also been predicted, which might help in understanding the tumorigenesis of the viral disease. A comparison between the procedure followed in this thesis and the results from another genome-scale method (proposed by Andrej Sali and coworkers) suggests that although the proteins involved in predicted interactions from the two methods may differ, the functions of the proteins concerned suggested by GO annotations are highly correlated (greater than 98%). In the next few chapters, the prediction of interactions for different host-pathogen systems is described. In Chapter 7, the prediction of PPI between a eukaryotic malarial pathogen, P. falciparum, and its human host is described.
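The comparison with experimentally verified interactions amounts to asking what fraction of the known pairs also appear among the predictions. A minimal sketch of that calculation is given below; the function name `fraction_recovered` and the identifiers in the example are hypothetical, and the abstract does not say how the thesis computed its figure.

```python
def fraction_recovered(predicted, known):
    """Fraction of known interactions that also appear in the predictions.

    predicted, known: iterables of (protein_a, protein_b) pairs.
    Pairs are treated as unordered, so (a, b) matches (b, a).
    """
    predicted_set = {frozenset(pair) for pair in predicted}
    known_set = {frozenset(pair) for pair in known}
    if not known_set:
        raise ValueError("the set of known interactions is empty")
    return len(predicted_set & known_set) / len(known_set)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    predicted = [("v1", "h1"), ("v2", "h2"), ("v3", "h9")]
    known = [("h1", "v1"), ("v2", "h5")]
    print(fraction_recovered(predicted, known))
```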
The malarial parasite was chosen because of the extensive work reported in the literature on this pathogen in recent years. Also, the gene expression patterns in the pathogen are highly correlated with human tissue types, with each stage of the pathogen occurring in a distinct tissue type. Thus, the biological context of the PPI can be explicitly assessed, which makes this example well suited to the procedure described in Chapter 6 of this thesis. The pathogen is important from a medical perspective since there has been a recent emergence of P. falciparum induced malaria which is unresponsive to conventional drugs. Thus, studies of this parasite have gained importance in the post-genomic era. The difficulty in identifying homologues of many of the P. falciparum proteins makes this a challenging case study. Prediction of PPI between the malarial parasite and the human proteins has been approached in the same way as described in Chapter 6, with the cutoffs in homology searches kept stringent. However, in this case effective use of available additional biological data has been possible. The tissue-specific expression information for human proteins has been obtained from the Atlas of Human transcriptome and the NCBI GEO database. The pathogen stage-specific expression data has been obtained from multiple genome-scale experiments reported in the literature. The subcellular localization of both human and pathogen proteins has only been predicted, and hence this information is given a low weight in subsequent analysis. The prediction of PPI between the malarial parasite and human resulted in a total of more than 30,000 interactions which were compatible with in vivo conditions according to the expression data. The set of predicted interactions was reduced further, to around 2000 interactions, by incorporating the subcellular localization predictions.
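The two-stage reduction described above (expression compatibility first, then subcellular localization) can be sketched as a pair of filters. This is an illustrative sketch under stated assumptions: expression data are reduced to sets of tissues per protein, localization to a single label per protein, and the set of compatible localization pairs is supplied by the user. The function name `filter_by_context`, the data layout, and the example labels are hypothetical and are not taken from the thesis.

```python
def filter_by_context(predictions, pathogen_tissues, host_tissues,
                      pathogen_loc, host_loc, compatible_locs):
    """Filter predicted pathogen-host pairs by biological context.

    predictions:      iterable of (pathogen_protein, host_protein).
    pathogen_tissues: dict pathogen protein -> set of host tissues in which
                      the stages expressing that protein occur.
    host_tissues:     dict host protein -> set of tissues where it is expressed.
    pathogen_loc:     dict pathogen protein -> predicted localization label.
    host_loc:         dict host protein -> predicted localization label.
    compatible_locs:  set of (pathogen_label, host_label) pairs regarded
                      as able to come into contact.
    Returns (expression_compatible, localization_compatible) lists; the
    second list is a subset of the first.
    """
    expression_ok = [
        (p, h) for p, h in predictions
        # Keep the pair only if both proteins occur in a shared tissue.
        if pathogen_tissues.get(p, set()) & host_tissues.get(h, set())
    ]
    localization_ok = [
        (p, h) for p, h in expression_ok
        # Proteins without a localization label are dropped at this stage.
        if (pathogen_loc.get(p), host_loc.get(h)) in compatible_locs
    ]
    return expression_ok, localization_ok


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    preds = [("pf1", "hs1"), ("pf2", "hs2"), ("pf3", "hs1")]
    p_tis = {"pf1": {"blood"}, "pf2": {"liver"}, "pf3": {"blood"}}
    h_tis = {"hs1": {"blood"}, "hs2": {"brain"}}
    p_loc = {"pf1": "exported", "pf3": "nucleus"}
    h_loc = {"hs1": "membrane", "hs2": "membrane"}
    ok = {("exported", "membrane")}
    print(filter_by_context(preds, p_tis, h_tis, p_loc, h_loc, ok))
```

Because the localization labels are themselves predictions, one could also treat the second filter as a ranking signal rather than a hard cut; the sketch applies it as a hard cut only for simplicity.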
Manual analysis of each of these interactions, aided by the literature on malarial parasites, reveals that many of the known PPI are also ‘predicted’ in the analysis, such as the interaction between the SSP2 protein of P. falciparum and human ICAMs. For many proteins known to be important for pathogenesis, such as the RESA antigen, novel interactions were predicted that could help in better understanding the pathogen. For some of the novel predicted interactions, such as that between the parasite Plasmepsin and human Spectrin, there exists circumstantial experimental evidence of interaction. Among many other novel interactions, the procedure could predict interactions for 441 ‘hypothetical proteins’ of unknown function coded in the genome of the pathogen. The comprehensive list of predictions made using the procedure, and an exploration of their biological significance, can lead to novel hypotheses regarding the pathogenesis of malaria, and hence the work presented in this chapter can be helpful for further experimental exploration of the pathogen. The success of the procedure in predicting known as well as novel interactions in a eukaryotic pathogen suggests that the procedure developed is generally applicable. However, it must be pointed out that for many host-pathogen systems such extensive expression and localization data may not be available, which makes the analysis difficult due to the large number of interactions predicted. One such difficult case is the interactions between Mycobacterial species and the human host, which is described in the next chapter. Chapter 8 describes the prediction of PPI between human and M. tuberculosis as well as three pathogens closely related to M. tuberculosis. Each of these pathogens has been seen to re-emerge due to drug resistance and other causes. M. tuberculosis is becoming a global problem due to the limited number of drugs available to treat TB and the susceptibility of these drugs to resistance.
M. leprae has also shown signs of emerging drug resistance, whereas C. diphtheriae, another pathogen studied in this chapter, is seen as an emerging pathogen in Eastern Europe and in the Indian subcontinent. Nocardial infections have also risen with the prevalence of AIDS, which increases susceptibility to Nocardia infections. Thus, there is a need to understand the pathogens in this important family further, in order to better direct drug development. An important area for such endeavors is the mapping of the PPI between the pathogens and the human host. The procedure developed as part of the thesis can be used to predict such interactions. The procedure for prediction of interactions is the same as that followed in Chapter 6 and involves identification of homologues of the pathogen and host proteins among the proteins listed in the two template datasets, DIP and iPfam, using PSI-BLAST and RPS-BLAST (MulPSSM). In addition to homology to the proteins involved in PPI, information on or prediction of subcellular localization is used to assess the biological significance of the interaction. An experimentally derived dataset of exported proteins in M. tuberculosis was used to supplement the predictions from the PSORTb database, which provides subcellular localization for bacterial proteins. In order to minimize the number of predictions explored manually and to maximize the biological relevance of predicted interactions, predictions were made only for proteins present on the membrane of the pathogen or exported into the host. Prediction of interactions between human proteins and the proteins of the four pathogens studied revealed that some of the interactions known from earlier experiments were “predicted” by the present procedure. For example, the M. leprae exported Serine protease is known to interact with Ras-like proteins in the human host, and this interaction was ‘predicted’.
Among other predicted interactions, several novel interactions have been suggested for proteins important for pathogenesis, such as the MPT70 protein of M. tuberculosis, which has been predicted to interact with TGFβ-associated proteins that could play an important role in the pathogenesis of the disease. Some human proteins are known to play an important role in pathogenesis, especially the toll-like receptors. A C. diphtheriae protein, Mycosin, has been predicted to interact with the toll-like receptors, raising the possibility that the Mycosins may play an important role in pathogenesis. Several hypothetical proteins of unknown function in the pathogens have been predicted to interact with human proteins. A few such cases from M. tuberculosis have been described in the thesis, and these proteins are predicted to interact with proteins involved in post-translational modification in the human host. The prediction of novel interactions along with known interactions in four bacterial species thus suggests that the procedure can be used for almost any host-pathogen pair. In the next chapter, the application of the method to three other bacterial species belonging to the Enterobacteriaceae family is presented. Chapter 9 describes the analysis performed on the predicted interactions between human and three pathogens in the Enterobact Protein Functions Bioactive Proteins Computational Biochemistry Protein-Protein Interactions Protein Homology Detection - Algorithms Protein Bioinformatics Protein Sequence Homology (Biology) Remote Homology Detection Biochemistry
20	Web-based atmospheric nucleation data management and visualization Zhu, Kai 01 January 2012 (has links) Atmospheric nucleation is a phase transformation process, like liquid water transforming into solid or gas-phase water, which has a significant impact on many atmospheric and technological processes. During atmospheric nucleation, certain 3D molecular models are generated, which are mainly mixtures of water molecules and hexanol molecules. Analyzing these 3D molecular models can promote the understanding of the nucleation and growth of particles and phases in a multi-component mixture, as well as of changes in climate and weather. Therefore, research on atmospheric nucleation can be recast as research on 3D molecular visualization and comparison, that is, similarity calculation. Unfortunately, research on understanding atmospheric nucleation processes is restricted by the lack of efficient visual data exploration tools. In this paper, the lack of efficient data visualization tools is tackled by implementing our own application to visualize atmospheric nucleation. The similarity calculation for these 3D molecules is implemented in order to analyze and compare the atmospheric nucleation processes and molecular models. Admittedly, there are various 3D molecular similarity calculation algorithms, such as clique-detection algorithms and point matching; however, these algorithms are used specifically in the fields of protein amino acids and pharmacophores. Due to the large scale of the atmospheric nucleation data, GPUs (Graphics Processing Units) are employed in order to significantly reduce computation times. This is achieved by using CUDA (Compute Unified Device Architecture) technology, which allows us to execute our algorithm in parallel.
Furthermore, in this research, hypertree visualization is intended to be used to enhance the previously developed web-based visualization and analysis tool that allows remote users to effectively mine the wealth of particle-based nucleation simulation data. The research goal is to speed up knowledge discovery and improve users' productivity through effective data visualization techniques and a friendlier user interface design. Meanwhile, a feasible parallel computing solution is developed to overcome the slow response caused by expensive pre-processing of large data. The core research of my thesis is to calculate the similarity between distinct 3D molecules. Engineering Physical Sciences and Mathematics
