1 |
Longitudinal descriptive study of the sense of belonging to school among secondary school students in the public and private sectors
Boily, Robert January 2002
Thesis digitized by the Direction des bibliothèques of the Université de Montréal.
|
2 |
Development of bioinformatics methods for the analysis of large collections of transcription factor binding motifs: positional motif enrichment and motif clustering
Castro-Mondragon, Jaime 13 July 2017
Transcription factors (TFs) are DNA-binding proteins that control gene expression. TF binding motifs (TFBMs, or simply “motifs”) are usually represented as position-specific scoring matrices (PSSMs), which can be visualized as sequence logos. The advent of high-throughput methods has allowed the detection of thousands of motifs, which are usually stored in databases. In this work I developed two novel methods and implemented software tools to handle large collections of motifs in order to extract interpretable information from high-throughput data: (i) matrix-clustering groups motifs by similarity and offers a dynamic interface; (ii) position-scan detects TFBMs with positional preferences relative to a given reference location (e.g. ChIP-seq peaks, transcription start sites). The methods I developed were evaluated on control cases and applied to extract meaningful information from different datasets from Drosophila melanogaster and Homo sapiens. The results show that these methods enable the analysis of motifs in high-throughput datasets and can be integrated into motif analysis workflows.
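The idea of a positional preference relative to a reference location can be illustrated with a minimal sketch. The abstract does not state which statistic position-scan uses, so the binning scheme and the chi-square comparison against a uniform expectation below are assumptions chosen for illustration, and the function names `positional_profile` and `chi_square_uniform` are hypothetical.

```python
def positional_profile(hit_positions, window_start, window_end, bin_width):
    """Count motif hits per bin; positions are relative to the reference (0)."""
    n_bins = (window_end - window_start) // bin_width
    counts = [0] * n_bins
    for pos in hit_positions:
        # Ignore hits outside the window covered by whole bins.
        if window_start <= pos < window_start + n_bins * bin_width:
            counts[(pos - window_start) // bin_width] += 1
    return counts


def chi_square_uniform(counts):
    """Chi-square statistic of the observed bin counts against a flat profile."""
    total = sum(counts)
    if total == 0:
        return 0.0
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical hit positions relative to a peak summit, in base pairs.
hits = [-5, -3, 0, 2, 4, 90, -95]
profile = positional_profile(hits, window_start=-100, window_end=100, bin_width=50)
statistic = chi_square_uniform(profile)
```

A larger statistic indicates a profile that departs more from uniform, which is one simple way to flag motifs concentrated near the reference coordinate.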
|
3 |
School Connectedness and Mental Health in College Students
Daley, Serena C. 30 July 2019
No description available.
|
4 |
Experiments with Support Vector Machines and Kernels
Kohram, Mojtaba 21 October 2013
No description available.
|
5 |
Computational Methods for Inferring Transcription Factor Binding Sites
Morozov, Vyacheslav 11 October 2012
Position weight matrices (PWMs) have become a tool of choice for the identification of transcription factor binding sites in DNA sequences. PWMs are compiled from experimentally verified and aligned binding sequences and are then used to computationally discover novel putative binding sites for a given protein. DNA-binding proteins often show degeneracy in their binding requirements, and the overall binding specificity of many proteins is unknown and remains an active area of research. Although PWMs are more reliable predictors than consensus string matching, they generally produce a high number of false positive hits. A previous study introduced a novel method for PWM training that uses the known motifs to sample additional putative binding sites from a proximal promoter area. The core idea was further developed, implemented and tested in this thesis in a large-scale application. Improved mono- and dinucleotide PWMs were computed for Drosophila melanogaster. The Matthews correlation coefficient was used as the optimization criterion in the PWM refinement algorithm. The new PWMs account for non-uniform background nucleotide distributions on the promoters and consider a larger number of new binding sites during the refinement steps. The optimization included the PWM motif length, the position on the promoter, the threshold value and the binding site location. The predictions obtained with the mono- and dinucleotide PWM versions were compared with those of the initial matrices and of conventional tools. The optimized PWMs predicted new binding sites with better accuracy than conventional PWMs.
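As a rough illustration of the concepts named in this abstract, the sketch below builds a mononucleotide log-odds PWM from aligned sites, scans a sequence against a score threshold, and computes the Matthews correlation coefficient from a confusion matrix. It is not the thesis's refinement algorithm: the pseudocount scheme, the function names, and the example sites are assumptions, and dinucleotide PWMs are not covered.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Log-odds PWM from equal-length aligned sites (one dict per position)."""
    if background is None:
        background = {base: 0.25 for base in ALPHABET}  # uniform by default
    total = len(sites) + pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        # Spread the pseudocount according to the background distribution.
        counts = {base: pseudocount * background[base] for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        pwm.append({base: math.log2((counts[base] / total) / background[base])
                    for base in ALPHABET})
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical aligned binding sites, for illustration only.
pwm = build_pwm(["TATAAT", "TATAAT", "TATGAT", "TACAAT"])
hits = scan("GGTATAATGG", pwm, threshold=5.0)
```

Passing a non-uniform `background` dictionary is one simple way to reflect the non-uniform promoter nucleotide distribution that the abstract mentions; raising `threshold` trades sensitivity for fewer false positive hits.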
|
7 |
Analysis Of Protein Evolution And Its Implications In Remote Homology Detection And Function Recognition
Gowri, V S 10 1900
One of the major outcomes of a genome sequencing project is the availability of the amino acid sequences of all the proteins encoded in the genome of the organism concerned. However, for a substantial proportion of these proteins, no information on function is available either from experimental studies or by inference on the basis of homology with a protein of known function. Even if the general function of a protein is known, the region of the protein corresponding to the function might be a single domain, and there may be additional regions of considerable length in the protein with no known function. In such cases the information on function is incomplete.
Lack of understanding of the repertoire of functions of the proteins encoded in the genome limits the utility of the genomic data. While there are many experimental approaches available for deciphering the functions of proteins at the genomic scale, bioinformatics approaches form a good early step in obtaining clues about these functions (Koonin et al., 1998). One of the common bioinformatics approaches is recognition of function by homology (Bork et al., 1994). If an evolutionary relationship can be established between two proteins, one with known function and the other with unknown function, it raises the possibility of a common function and 3-D structure for these proteins (Bork and Gibson, 1996). While this approach is effective, its utility is limited by the ability of bioinformatics methods to identify related proteins when their evolutionary divergence is high, leading to amino acid sequence similarity as low as that typical of two unrelated proteins (Bork and Koonin, 1998). Use of 3-D structural information, obtained by predictive methods such as fold recognition, has offered approaches towards increasing the sensitivity of remote homology detection (e.g., Kelley et al., 2000; Shi et al., 2001; Gough et al., 2001).
The work embodied in this thesis has the general objective of analysis of evolution of structural features and functions of families of proteins and design of new bioinformatics approaches for recognizing distantly related proteins and their applications. After an introductory chapter, a few chapters report analysis of functional and structural features of homologous protein domains. Further chapters report development and assessment of new remote homology detection approaches and applications to the proteins encoded in two protozoan organisms. A further chapter is presented on the analysis of proteins involved in methylglyoxal detoxification pathways in kinetoplastid organisms.
Chapter 1 of the thesis presents a brief introduction, based on the information available in the literature, to protein structures, classification, methods for structure comparison, popular methods for remote homology detection and homology-based methods for function annotation.
Chapter 2 describes the steps involved in the update of, and the improvements made to, the PALI database. In addition to the update, the domain structural families are integrated with the homologous sequences from the sequence databases. Thus, every family in PALI is enriched with a substantial volume of sequence information from proteins with no known structural information.
Chapter 3 reports investigations on the inter-relationships between sequence, structure and functions of closely-related homologous enzyme domain families.
Chapter 4 describes the investigations on the unusual differences in the lengths of closely-related homologous protein domains, accommodation of additional lengths in protein 3-D structures and their functional implications.
Chapter 5 reports the development and assessment of a new approach for remote homology detection using dynamic multiple profiles of homologous protein domain families.
Chapter 6 describes the development of another remote homology detection approach, which uses multiple static profiles generated from the bona fide members of the family. A rigorous assessment of the approach and strategies for improving the detection of distant homologues using the multiple profile approach are discussed in this chapter.
Chapter 7 describes results of searches made in the database of multiple family profiles (MulPSSM database) in order to recognize the functions of hypothetical proteins encoded in two parasitic protozoa.
Chapter 8 describes the sequence and structural analyses of two glyoxalase pathway proteins from the kinetoplastid organism Leishmania donovani, which causes leishmaniasis. An alternative enzyme, which probably substitutes for the glyoxalase pathway enzymes in certain kinetoplastid organisms that lack them, is also discussed.
Chapter 9 summarises the important findings from the various analyses discussed in this thesis.
The appendix describes an analysis of the correlation between a measure of hydrophobicity of amino acid residues aligned in a multiple sequence alignment and residue depth in 3-D structures of proteins.
|
8 |
Computational Methods for Inferring Transcription Factor Binding Sites
Morozov, Vyacheslav January 2012
Position weight matrices (PWMs) have become a tool of choice for the identification of transcription factor binding sites in DNA sequences. PWMs are compiled from experimentally verified and aligned binding sequences and are then used to computationally discover novel putative binding sites for a given protein. DNA-binding proteins often show degeneracy in their binding requirements, and the overall binding specificity of many proteins is unknown and remains an active area of research. Although PWMs are more reliable predictors than consensus string matching, they generally produce a high number of false positive hits. A previous study introduced a novel method for PWM training that uses the known motifs to sample additional putative binding sites from a proximal promoter area. The core idea was further developed, implemented and tested in this thesis in a large-scale application. Improved mono- and dinucleotide PWMs were computed for Drosophila melanogaster. The Matthews correlation coefficient was used as the optimization criterion in the PWM refinement algorithm. The new PWMs account for non-uniform background nucleotide distributions on the promoters and consider a larger number of new binding sites during the refinement steps. The optimization included the PWM motif length, the position on the promoter, the threshold value and the binding site location. The predictions obtained with the mono- and dinucleotide PWM versions were compared with those of the initial matrices and of conventional tools. The optimized PWMs predicted new binding sites with better accuracy than conventional PWMs.
|