Global ETD Search

1	Identifying College Students’ Course-Taking Patterns in STEM Fields Bahrami, Fahimeh 01 January 2019 In spite of substantial investments in science, technology, engineering, and mathematics (STEM) education, low enrollment and high attrition rates among students in these fields remain an unmitigated challenge for higher education institutions. In particular, the underrepresentation of women and minority students among holders of STEM-related college degrees replicates itself in the makeup of the workforce, adding another layer to the challenge. While most studies examine the relationship between student characteristics and their outcomes, in this study I take a new approach and treat academic pathways as a dynamic process in which students’ curricular experiences influence their decisions about subsequent course-taking and major field of study. I leverage data mining techniques to examine the processes leading to degree completion in STEM fields. Specifically, I apply Sequential Pattern Mining and Sequential Clustering to student transcript data from a four-year university to identify frequent academic major trajectories and the most frequent course-taking patterns in STEM fields. I also investigate whether there are any significant differences between male and female students’ academic major and course-taking patterns in these fields. The findings suggest that non-STEM majoring paths are the most frequent academic pattern among students, followed by life science trajectories. Engineering and other hard science trajectories are much less frequent. The frequency of all STEM trajectories, however, declines over time as students switch to non-STEM majors. The switching rate from non-STEM to STEM fields over time is much lower. I also find that male and female students follow different academic pathways, and these gender-based differences are even more significant within STEM fields. 
Students’ course-taking patterns also suggest that taking engineering and computer science courses is predominantly a male course-taking behavior, while females are more likely to pursue academic pathways in life science. I also find that STEM introductory courses (particularly Calculus I, Calculus II and Chemistry I) are gateway courses that serve as potential barriers to pursuing degrees in STEM-related fields for a large number of students who showed an initial interest in STEM courses. Female students were more likely to switch to non-STEM fields after taking these courses, while male students were more likely to drop out of college altogether. In addition to its findings on students’ academic pathways toward attaining a college degree in a STEM-related field, this study shows how data mining techniques that leverage data about the sequence of courses students take can be used by higher education leaders and researchers to better understand students’ academic progress and explore how students navigate and interact with the college curriculum. In particular, this study demonstrates how these analytic approaches might be used to design and structure more effective course-taking pathways and develop interventions to improve student retention in STEM fields. Academic Pathway Course-taking Data mining STEM education Transcript Analysis Education Higher Education
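The idea behind the sequential pattern mining mentioned in this abstract can be illustrated with a minimal sketch. It is not the author's method or data: the transcripts, the function name frequent_subsequences and the 50% support threshold are all assumptions made for illustration. The sketch counts how many students took a given ordered pair of courses (not necessarily in consecutive terms) and keeps the pairs that reach the support threshold.

```python
from collections import Counter
from itertools import combinations


def frequent_subsequences(sequences, length=2, min_support=0.5):
    """Return ordered course subsequences shared by at least min_support of students.

    The result maps each pattern (a tuple of courses) to its support,
    i.e. the fraction of students whose transcript contains it in order.
    """
    counts = Counter()
    for seq in sequences:
        # combinations() preserves input order, so each tuple is an ordered
        # (not necessarily contiguous) subsequence; set() counts it once per student.
        counts.update(set(combinations(seq, length)))
    threshold = min_support * len(sequences)
    return {p: n / len(sequences) for p, n in counts.items() if n >= threshold}


# Hypothetical transcripts: one chronologically ordered course list per student.
transcripts = [
    ["Calculus I", "Chemistry I", "Calculus II", "Physics I"],
    ["Calculus I", "Calculus II", "Biology I"],
    ["Chemistry I", "Biology I", "Biology II"],
    ["Calculus I", "Chemistry I", "Biology I"],
]

patterns = frequent_subsequences(transcripts, length=2, min_support=0.5)
for pattern, support in sorted(patterns.items()):
    print(" -> ".join(pattern), f"(support {support:.2f})")
```

Brute-force enumeration like this only scales to short patterns; dedicated algorithms such as PrefixSpan or GSP would be the usual choice for real transcript data.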
2	Expanding the repertoire of bacterial (non-)coding RNAs Findeiß, Sven 02 May 2011 The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory modes of action has moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated into proteins but function directly at the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration. Bacterial ncRNAs either arise from independent primary transcripts or their mature sequence is generated via processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimeras with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene. The quality and completeness of genome annotation are essential for all subsequent analyses. In contrast to protein-coding genes, ncRNAs lack clear statistical signals at the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines, and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently have experimental setups been developed to systematically identify these elements on a genome-wide scale in prokaryotes. 
The first part of this thesis describes three exploratory experimental surveys of transcript organization in pathogenic bacteria. To identify ncRNAs in Pseudomonas aeruginosa we used a combination of an experimental RNomics approach and ncRNA prediction. Besides already known ncRNAs, we identified and validated the expression of six novel RNA genes. Global detection of transcripts by next-generation RNA sequencing techniques revealed an unexpectedly complex transcript organization in many bacteria. These ultra-high-throughput methods give us the appealing opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptome of Helicobacter pylori and Xanthomonas campestris. For the first time, we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis, we developed new tools to annotate TSS, to detect small protein-coding genes and to infer homology of newly detected transcripts. We discovered hundreds of TSS in intergenic regions, upstream of protein-coding genes, within operons and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that had previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes. Experimental setups are time-consuming and often costly. 
Another limitation of experimental approaches is that genes which are expressed only in specific developmental stages or stress conditions are likely to be missed. Bioinformatic tools offer an alternative that overcomes such constraints. General approaches usually depend on comparative genomic data, using evolutionary signatures to analyze the (non-)coding potential of multiple sequence alignments. In the second part of my thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to assess the local protein-coding potential of genomic regions. RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model, we significantly improved the prediction accuracy of RNAz. During transcriptome analysis we repeatedly identified small protein-coding genes that have not been annotated so far. Only a few of those genes are known to date, and standard protein-coding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses small protein-coding genes. We have implemented RNAcode, which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, an evaluation of the protein-coding potential of these fragments is an essential task. 
RNAcode reports local regions of high coding potential instead of complete protein-coding genes. Training on known protein-coding sequences is not necessary, and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome, where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data we found compelling evidence that several of the identified candidates are bona fide proteins. In summary, this thesis clearly demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode we have helped to complete the repertoire of gene finding software, which will help to unearth hidden treasures of the RNA World. Transkriptom kleine RNAs Transkriptanalyse vergleichende Genomik transcriptome small RNAs transcript analysis comparative genomics ddc:000
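The effect of the length cutoff mentioned in this abstract can be illustrated with a minimal sketch. This is not RNAcode, which according to the abstract relies on evolutionary signatures in multiple sequence alignments; it is a naive forward-strand open reading frame (ORF) scan. The function find_orfs, the toy sequences and the cutoff value of 40 are assumptions chosen for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_aa=0):
    """Return (start, end, length_in_aa) for forward-strand ORFs of at least min_aa codons.

    An ORF here runs from the first ATG in a reading frame to the next
    in-frame stop codon; end is the index just past the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons from ATG up to, but excluding, the stop
                if n_aa >= min_aa:
                    orfs.append((start, i + 3, n_aa))
                start = None
    return orfs


# Toy sequence (hypothetical): a 20-codon ORF followed by a 60-codon ORF.
small_gene = "ATG" + "GCT" * 19 + "TAA"
long_gene = "ATG" + "GCT" * 59 + "TGA"
genome = "CC" + small_gene + "C" + long_gene

all_orfs = find_orfs(genome)
filtered = find_orfs(genome, min_aa=40)
print("without cutoff:", sorted(all_orfs))
print("with 40 aa cutoff:", filtered)
```

With the cutoff applied, the 20-codon ORF is discarded together with any spurious short ORFs, which is the trade-off the abstract describes.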
3	Zavedení nových metod pro studium molekulárně genetické podstaty onemocnění CADASIL / Implementation of New Methods for Studying the Molecular Genetic Basis of the CADASIL Disease Hrubá, Monika January 2017 CADASIL is a neurodegenerative, autosomal dominant hereditary disease with late onset. The main symptoms are migraines with aura, cerebral ischemic events, cognitive impairment and dementia. The disease is caused by mutations in the NOTCH3 gene. The major mutation type changes the number of cysteine residues in the EGF-like repeats of the Notch3 protein. In the Czech Republic, the methods currently used for molecular genetic analysis of CADASIL are Sanger sequencing and MLPA. However, there are patients with CADASIL-like symptoms in whom the diagnosis was not confirmed by these methods. Therefore, the aim of this thesis was to implement transcript analysis by Sanger sequencing of cDNA PCR products, together with quantitative real-time PCR (qPCR) for the analysis of gross deletions and duplications, in order to clarify the molecular genetic basis of the disease. By transcript analysis, the existence of the transcript variant X1 was experimentally confirmed in control samples. Moreover, the results of transcript analysis showed that the non-typical missense mutation c.1725G>A (p.T575=), which does not directly change the number of cysteine residues, can cause CADASIL via missplicing and a subsequent deletion that includes cysteine residues. The other tested variants did not show any changes at the transcript level. The qPCR method did not...
4	Assertion and accommodation : a study of the assertive language in the conversations of school-age (5-13 years) girls Topham, Emma January 2018 This study aimed to investigate the accommodation of assertive utterances (AUs) in the conversations of 49 girls aged 5;0-13;1. Based on the findings of earlier research that the use of such language is more closely related to age than to gender, it was predicted that speakers would accommodate their use of and response to assertive utterances according to their partner's age. Naturalistic language from these speakers was collected over a year, and evidence of accommodation was observed in all speakers. Fewer AUs were used with younger speakers than with older ones, and those used with younger girls were more likely to be produced with the sole purpose of controlling the hearer's behaviour. In addition, AUs were more likely to be complied with, or accepted, when they were produced by older girls. Given what is known about the types of language used by powerful/powerless individuals, it appears that these speakers consider age to be an indicator of status. A particularly interesting finding was that it was the age of a speaker in relation to other members of the conversation, rather than the age of the speaker alone, that influenced their use of and response to AUs.
5	Expanding the repertoire of bacterial (non-)coding RNAs Findeiß, Sven 03 July 2011 The detection of non-protein-coding RNA (ncRNA) genes in bacteria and their diverse regulatory modes of action has moved the experimental and bio-computational analysis of ncRNAs into the focus of attention. Regulatory ncRNA transcripts are not translated into proteins but function directly at the RNA level. These typically small RNAs have been found to be involved in diverse processes such as (post-)transcriptional regulation and modification, translation, protein translocation, protein degradation and sequestration. Bacterial ncRNAs either arise from independent primary transcripts or their mature sequence is generated via processing from a precursor. Besides these autonomous transcripts, RNA regulators (e.g. riboswitches and RNA thermometers) also form chimeras with protein-coding sequences. These structured regulatory elements are encoded within the messenger RNA and directly regulate the expression of their “host” gene. The quality and completeness of genome annotation are essential for all subsequent analyses. In contrast to protein-coding genes, ncRNAs lack clear statistical signals at the sequence level. Thus, sophisticated tools have been developed to automatically identify ncRNA genes. Unfortunately, these tools are not part of generic genome annotation pipelines, and therefore computational searches for known ncRNA genes are the starting point of each study. Moreover, prokaryotic genome annotation lacks essential features of protein-coding genes. Many known ncRNAs regulate translation via base-pairing to the 5’ UTR (untranslated region) of mRNA transcripts. Eukaryotic 5’ UTRs have been routinely annotated by sequencing of ESTs (expressed sequence tags) for more than a decade. Only recently have experimental setups been developed to systematically identify these elements on a genome-wide scale in prokaryotes. 
The first part of this thesis describes three exploratory experimental surveys of transcript organization in pathogenic bacteria. To identify ncRNAs in Pseudomonas aeruginosa we used a combination of an experimental RNomics approach and ncRNA prediction. Besides already known ncRNAs, we identified and validated the expression of six novel RNA genes. Global detection of transcripts by next-generation RNA sequencing techniques revealed an unexpectedly complex transcript organization in many bacteria. These ultra-high-throughput methods give us the appealing opportunity to analyze the complete RNA output of any species at once. The development of the differential RNA sequencing (dRNA-seq) approach enabled us to analyze the primary transcriptome of Helicobacter pylori and Xanthomonas campestris. For the first time, we generated a comprehensive and precise transcription start site (TSS) map for both species and provide a general framework for the analysis of dRNA-seq data. Focusing on computer-aided analysis, we developed new tools to annotate TSS, to detect small protein-coding genes and to infer homology of newly detected transcripts. We discovered hundreds of TSS in intergenic regions, upstream of protein-coding genes, within operons and antisense to annotated genes. Analysis of 5’ UTRs (spanning from the TSS to the start codon of the adjacent protein-coding gene) revealed an unexpected size diversity ranging from zero to several hundred nucleotides. We identified and validated the expression of about 60 and about 20 ncRNA candidates in Helicobacter and Xanthomonas, respectively. Among these ncRNA candidates we found several small protein-coding genes that had previously evaded annotation in both species. We showed that the combination of dRNA-seq and computational analysis is a powerful method to examine prokaryotic transcriptomes. Experimental setups are time-consuming and often costly. 
Another limitation of experimental approaches is that genes which are expressed only in specific developmental stages or stress conditions are likely to be missed. Bioinformatic tools offer an alternative that overcomes such constraints. General approaches usually depend on comparative genomic data, using evolutionary signatures to analyze the (non-)coding potential of multiple sequence alignments. In the second part of my thesis we present our major update of the widely used ncRNA gene finder RNAz and introduce RNAcode, an efficient tool to assess the local protein-coding potential of genomic regions. RNAz has been successfully used to identify structured RNA elements in all domains of life. However, our own experience and user feedback not only demonstrated the applicability of the RNAz approach, but also helped us to identify limitations of the current implementation. Using a much larger training set and a new classification model, we significantly improved the prediction accuracy of RNAz. During transcriptome analysis we repeatedly identified small protein-coding genes that have not been annotated so far. Only a few of those genes are known to date, and standard protein-coding gene finding tools suffer from the lack of training data. To avoid an excess of false positive predictions, gene finding software is usually run with an arbitrary cutoff of 40-50 amino acids and therefore misses small protein-coding genes. We have implemented RNAcode, which is optimized for emerging applications not covered by standard protein-coding gene annotation software. In addition to complementing classical protein gene annotation, a major field of application of RNAcode is the functional classification of transcribed regions. RNA sequencing analyses are likely to falsely report transcript fragments (e.g. mRNA degradation products) as non-coding. Hence, an evaluation of the protein-coding potential of these fragments is an essential task. 
RNAcode reports local regions of high coding potential instead of complete protein-coding genes. Training on known protein-coding sequences is not necessary, and RNAcode can therefore be applied to any species. We showed this with our analysis of the Escherichia coli genome, where the current annotation could be accurately reproduced. We furthermore identified novel small protein-coding genes with RNAcode in this extensively studied genome. Using transcriptome and proteome data we found compelling evidence that several of the identified candidates are bona fide proteins. In summary, this thesis clearly demonstrates that bioinformatic methods are mandatory to analyze the huge amount of transcriptome data and to identify novel (non-)coding RNA genes. With the major update of RNAz and the implementation of RNAcode we have helped to complete the repertoire of gene finding software, which will help to unearth hidden treasures of the RNA World. info:eu-repo/classification/ddc/000 ddc:000
6	Implication des remaniements géniques dans l'inactivation des gènes de prédisposition au cancer du sein / Germline large rearrangements in the inactivation of genes implied in breast cancer predisposition Rouleau, Etienne 07 December 2011 (has links) Parmi les cancers du sein, 5 à 10% serait associé à une prédisposition génétique familiale. La prise en charge des patients prédisposés nécessite une bonne définition des risques de cancer. L’identification de l’altération moléculaire causale dans chacune de ces familles est donc un enjeu essentiel dans la prise en charge médicale. Deux gènes, BRCA1 et BRCA2, sont associés à une prédisposition majeure au cancer du sein et de l’ovaire depuis le milieu des années 1990, expliquant environ 15% des formes héréditaires. L’analyse moléculaire de ces deux gènes est désormais réalisée en routine pour la recherche de variations nucléotidiques et plus récemment de remaniements géniques ce qui a permis d’améliorer le taux de détection de mutations délétères. Cependant, pour près de 85% des familles avec une agrégation familiale ou un âge anormalement jeune de cancer du sein, aucune mutation délétère n’a pu être mise en évidence. Dans ce contexte, mon travail de thèse a eu pour objectif de tester plusieurs hypothèses permettant d’expliquer les risques de cancer du sein observés chez des familles montrant l’absence de mutation des gènes BRCA1 et BRCA2. Nous avons ainsi recherché des mécanismes d’altération rarement explorés pour les gènes BRCA1 et BRCA2, et enfin analysé d’autres gènes candidats dont le gène CDH1 et huit autres gènes impliqués dans la réparation de l’ADN. Nous avons pu mieux caractériser des remaniements sur les gènes BRCA1 et BRCA2. Enfin, nous avons pu évaluer l’impact de variants de signification inconnue et des réarrangements détectés par l’étude de leurs transcrits. 
Dans un premier temps, nous avons mis en place et validé de nouvelles approches techniques de détection et de caractérisation : la CGH-array dédiée, la qPCR-HRM et le peignage moléculaire. Ces techniques ont ensuite été utilisées pour étudier les remaniements géniques et leur fréquence pour onze gènes candidats à la prédisposition au cancer du sein à partir de 472 familles négatives aux mutations délétères BRCA1 et BRCA2. Parmi ces 11 gènes, nous pouvons conclure que les remaniements géniques détectés concernent principalement les gènes BRCA1 et BRCA2, et à un moindre degré le gène CHEK2. En appliquant ces techniques, nous avons pu décrire de nouveaux événements, deux larges délétions et une duplication intronique, pour les gènes CDH1 et BARD1, ouvrant de nouvelles perspectives sur l’étude des transcrits alternatifs. Nous avons en particulier pu décrire la grande diversité des réarrangements délétères en 5’ du gène BRCA1. L’enjeu est ensuite l’interprétation de ces événements. Notre étude des transcrits a permis de décrire un variant exonique d’épissage entraînant une délétion de l’exon 23 au niveau du transcrit BRCA1. Nous avons aussi validé la pathogénicité d’un réarrangement en phase de l’exon 3 de BRCA2 par une étude quantitative du transcrit et une évaluation de la coségrégation. Au final, moins de 1% de nouveaux remaniements ont été mis en évidence. Ce travail est riche d’enseignement pour les nouvelles investigations à mettre en place pour les familles prédisposées. En dehors de la technique d’identification, il est nécessaire de développer des stratégies de validation basées principalement sur la quantification des effets de ces altérations au niveau de l’ARN et des protéines. Cependant, il manque encore de nombreux chaînons pour expliquer l’héritabilité des cancers du sein. 
Les études sur les nouveaux gènes candidats et l’avènement des techniques de séquençage pangénome à haut débit, devraient permettre d’avoir une meilleure vision des phénomènes pathobiologiques liés à la prédisposition au cancer du sein. / Five to 10% of breast cancers are linked to a familial genetic predisposition. The management of patients at risk requires a good definition of their cancer risk. The identification of the causal molecular alteration in each of these families is therefore a key issue in medical care. Two genes, BRCA1 and BRCA2, have been associated with major susceptibility to breast and ovarian cancer since the mid-1990s, accounting for about 15% of hereditary forms. Molecular analysis of these two genes is now routinely performed to detect nucleotide variations and, more recently, large rearrangements, which has improved the detection rate of deleterious mutations. However, for nearly 85% of families with familial aggregation or an unusually young age of breast cancer onset, no deleterious mutation has been identified. In this context, my thesis aimed at testing several hypotheses to explain the risks of breast cancer observed in families without any identified mutation in the BRCA1 and BRCA2 genes. We investigated mechanisms of alteration rarely explored for the BRCA1 and BRCA2 genes, and finally analyzed other candidate genes, including the CDH1 gene and eight other genes involved in double-strand DNA repair. We were able to better characterize rearrangements in the BRCA1 and BRCA2 genes. Finally, we applied quantitative RNA approaches to better assess the impact of variants of unknown significance and of the detected rearrangements. Initially, we developed and validated new technical approaches for detection and characterization: dedicated CGH-array, qPCR-HRM and molecular combing. 
Large germline rearrangements and their frequency in eleven candidate genes for susceptibility to breast cancer were then studied among 472 families that tested negative for deleterious BRCA1 and BRCA2 mutations. Of these 11 genes, we conclude that genic rearrangements are found mainly in the BRCA1 and BRCA2 genes, and to a lesser extent in the CHEK2 gene. We were able to describe new events, two large deletions and one intronic duplication, in the CDH1 and BARD1 genes, opening new perspectives on the study of their alternative transcripts. In particular, we described the wide diversity of deleterious rearrangements involving the 5' region of the BRCA1 gene. The next challenge is to validate and interpret these events. Our transcript analysis described an exonic splice variant causing deletion of exon 23 in the BRCA1 transcript. We also validated the pathogenicity of an in-frame rearrangement of BRCA2 exon 3 through quantitative transcript analysis and an evaluation of disease cosegregation. In the end, new rearrangements were identified in fewer than 1% of cases. This work is instructive for the further investigations needed to establish the molecular etiology in families with breast cancer predisposition. Beyond the identification technique itself, it is necessary to develop validation strategies based primarily on quantifying the effects of these alterations at the RNA and protein levels. However, many links are still missing to explain the heritability of breast cancer. Studies of new candidate genes and the advent of high-throughput whole-genome sequencing are expected to give a better view of the pathobiological phenomena related to breast cancer predisposition. Oncogénétique Prédisposition au cancer du sein BRCA1 BRCA2 QPCR-HRM CGH-array dédiée Peignage moléculaire Étude de transcript Gène candidat Oncogenetic Breast cancer predisposition BRCA1 BRCA2 QPCR-HRM Dedicated CGH-array Molecular combing Transcript analysis Candidate gene
