  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
571

Genetische Faktoren der humanen Cholesterinbiosynthese / Genetic factors of human cholesterol biosynthesis

Baier, Jan 10 October 2012
Background: Genome-wide association studies (GWAs) have identified almost one hundred genetic loci associated with variance in human blood lipid phenotypes, including very low-density lipoprotein cholesterol, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, total cholesterol and triglycerides. Nevertheless, the identified loci explain only a small fraction of heritability; our study therefore examined a subtle phenotype of cholesterol homeostasis for the very first time. Methods and Results: Using a multi-stage GWA approach, a genome-wide analysis (Affymetrix 500K GeneChip) for serum lanosterol and serum total cholesterol, measured by LC-MS/MS, was first conducted in 1495 participants of the KORA-S3/F3 cohort, with subsequent replication in two additional independent samples from the KORA-S3/F3 cohort (n = 1157) and the CARLA cohort (n = 1760). Two genetic variants in the HMGCR locus, SNPs rs7703051 and rs17562686, were significantly associated with serum lanosterol and showed similar effects of elevated serum lanosterol for each minor allele (combined n = 4412: p = 1.4 × 10^-10, +7.1% and p = 4.3 × 10^-6, +7.8%). Furthermore, rs7703051 showed a nominally significant association with serum cholesterol (p = 0.04). A combined analysis of both SNPs demonstrated that the observed associations of rs17562686 can be partly explained by LD with rs7703051, which is the primary polymorphism in this study. Nevertheless, rs17562686 shows consistent independent effects on serum lanosterol and is thus associated with a lipid phenotype for the very first time. Subsequent SNP fine mapping of the HMGCR locus was carried out in the CARLA cohort, with validation in the LE-Heart cohort (n = 1895). 
The recently published SNP rs3846662, which is in tight LD with rs7703051, was associated with variance in serum lanosterol in both cohorts. Functional in vivo studies of gene expression using qRT-PCR assays demonstrated a highly significant association between higher expression of alternatively spliced HMGCR mRNA lacking exon 13 and homozygosity for the rs3846662 major allele in 51 human liver samples (p < 0.01) and 958 human PBMCs (p = 2.1 × 10^-7). Overall HMGCR gene expression was not affected. Further investigation of in vivo HMG-CoA reductase enzyme activity in both human samples (n = 48 and n = 55), using anion exchange column chromatography and scintillation counting of [3-14C]-HMG-CoA and [5-3H]-mevalonolactone, did not show any significant results. In addition, there was no association in the LE-Heart cohort between these SNPs and the development of CAD. Finally, the already published associations were replicated for rs7703051 with total cholesterol (combined n = 4412) and for rs3846662 with LDL-cholesterol (LE-Heart n = 1895). Since fine mapping in CARLA showed several SNPs throughout the HMGCR locus in LD with rs17562686, we sequenced the extended 5′-HMGCR promoter region in six human liver samples. A previously unknown SNP was discovered in the promoter but could not be associated with any of the phenotypes examined above. The minor allele of SNP rs5909, situated next to the stop codon and in high LD with rs17562686, was associated with elevated serum lanosterol and slightly reduced HMGCR gene expression, but further studies, including those mentioned above as well as measurement of 3′-UTR transcript lengths using qRT-PCR assays, did not produce significant results. Conclusion: The phenotype serum lanosterol could be associated with genetic polymorphisms (e.g. rs7703051) in the HMGCR locus. Already published associations of HMGCR with total cholesterol and LDL-cholesterol can therefore be explained by variance in cholesterol homeostasis. 
The SNP rs17562686 could be associated with a phenotype of human blood lipids for the very first time. Subsequent gene expression analyses demonstrated a highly significant association of rs3846662 with variant patterns of HMGCR alternative splicing. A significant effect of the alternatively spliced protein on enzyme activity and an association of these SNPs with CAD could not be shown.
572

Split Intein Applications for Downstream Purification and Protein Conjugation

Galiardi, Jackelyn 05 October 2021
No description available.
573

Functional Analyses of Human DDX41 and LUC7-like Proteins Involved in Splicing Regulation and Myeloid Neoplasms

Daniels, Noah James 23 May 2022
No description available.
574

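Identifizierung und Charakterisierung einer alternativ gespleißten mRNA der Interleukin-4 Rezeptor alpha-Kette und Untersuchung der biologischen Funktion der verkürzten Rezeptorvariante / Identification and characterization of an alternatively spliced mRNA of the interleukin-4 receptor alpha chain and investigation of the biological function of the truncated receptor variant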

Möricke, Anja 15 April 2002
Alternatives mRNA-Splicing ist ein häufig beobachtetes Phänomen, das es der Zelle ermöglicht, unterschiedliche Proteine aus einem Gen zu generieren. In den letzten Jahren wurden immer mehr alternativ gespleißte Transkripte entdeckt, und einigen der daraus resultierenden Protein-Isoformen konnten geänderte biologische Funktionen zugeordnet werden. In dieser Arbeit ist erstmals ein alternativ gespleißtes Transkript der Interleukin-4 Rezeptor alpha (IL-4R-alpha) Kette beschrieben. Dieser mRNA Splice-Variante, genannt IL-4R-alpha-IT, fehlt im membranproximalen Bereich der zytoplasmatischen Domäne ein komplettes Exon. Dies führt zur Verschiebung des Leserasters und so zur Entstehung eines vorzeitigen Stop-Codons. Der resultierenden Protein-Isoform fehlt der größte Teil der intrazellulären Kette mit den dort enthaltenen, für die Signaltransduktion essentiellen Domänen. Die Untersuchung der biologischen Funktion der Rezeptor-Varianten in einem geeigneten Zellsystem der Maus zeigte, daß die Splice-Variante IL-4R-alpha-IT keine Proliferation der Zellen vermitteln und auch den Übergang der Zellen in die Apoptose nicht verhindern kann. Bei der Quantifizierung der Expression von IL-4R-alpha-IT-mRNA in Relation zum IL-4R-alpha voller Länge mit einer kompetitiven RT-PCR an Knochenmark und peripheren Blutlymphozyten von Kindern mit ALL zeigte sich zunächst ein irreführender Unterschied zwischen Proben von Kindern mit ALL-Ersterkrankung und Rezidiv. Weitere Untersuchungen ergaben jedoch, daß der Zeitraum zwischen Abnahme und Aufarbeitung des Untersuchungsmaterials für diesen scheinbaren Zusammenhang verantwortlich war. Während direkt nach Abnahme aufgearbeitetes Untersuchungsmaterial eine nur niedrige relative Expression der Splice-Variante zeigte, nahm diese bei verzögerter Aufarbeitung drastisch zu. Diese Beobachtung wurde experimentell an Proben gesunder Probanden wiederholt bestätigt. 
Interessanterweise konnte derselbe Effekt in unterschiedlicher Ausprägung auch bei Splice-Varianten anderer Zytokine und -Rezeptoren wie IL-7, IL-7R und beta-C beobachtet werden. mRNA-Stabilitäts-Assays und die Bestimmung der einzelnen Transkripte mit einer semiquantitativen RT-PCR zeigten, daß es tatsächlich zu einer absoluten Hochregulation der IL-4R-alpha-IT-mRNA in den verzögert aufgearbeiteten Proben kommt. Wurden die Zellen wieder in Kultur genommen, war dies innerhalb weniger Stunden reversibel. Desweiteren scheinen auch unterschiedliche mRNA-Stabilitäten eine Rolle zu spielen. / Alternative pre-mRNA splicing is a widespread mechanism contributing to the diversity of gene expression. The number of newly detected alternatively spliced transcripts has risen continuously, and distinct biological functions have been attributed to some of the protein isoforms resulting from these mRNA variants. We report the detection of a novel alternatively spliced transcript of the human interleukin-4 receptor alpha (IL-4R-alpha) chain, which has been named IL-4R-alpha-IT mRNA. Omission of one exon in the membrane-proximal region of the cytoplasmic domain introduces a premature stop codon, so the mRNA variant encodes an intracellularly truncated receptor protein lacking domains that are essential for signal transduction. Investigation of the biological function of the IL-4R-alpha splice variants in a suitable mouse cell system showed that the truncated receptor variant is not able to mediate cell proliferation or prevent apoptosis. Bone marrow and peripheral blood samples from children with acute lymphoblastic leukemia (ALL) were analyzed for the expression of IL-4R-alpha-IT mRNA relative to the full-length receptor transcript by competitive RT-PCR. Initially, a difference in IL-4R-alpha-IT mRNA expression was found between patients with initial ALL and patients with relapsed ALL. 
However, this difference turned out to be due to the time interval between collection and preparation of the samples. While freshly isolated material was associated with low levels of IL-4R-alpha-IT mRNA, samples with a longer period until cell preparation exhibited a drastic increase in IL-4R-alpha-IT mRNA levels. The same results were obtained for peripheral blood samples from healthy donors by imitating a prolonged transport time before cell preparation. Interestingly, a similar effect could be demonstrated for splice variants of other cytokine receptors and cytokines (beta-C, IL-7R, and IL-7), although to different extents. mRNA stability assays and semiquantitative RT-PCR specific for IL-4R-alpha or IL-4R-alpha-IT, respectively, indicated that the expression of IL-4R-alpha-IT mRNA increases in absolute terms in these samples, although mRNA degradation may be of importance as well.
575

Cr:forsterite laser frequency comb stabilization and development of portable frequency references inside a hollow optical fiber

Thapa, Rajesh January 1900
Doctor of Philosophy / Department of Physics / Kristan L. Corwin / We have made significant progress in the development of portable frequency standards inside hollow optical fibers. Such standards will improve the portable optical frequency references available to the telecommunications industry. Our approach relies on the development of a stabilized Cr:forsterite laser to generate the frequency comb in the near-IR region. This laser is self-referenced and locked to a CW laser, which in turn is stabilized to a sub-Doppler feature of a molecular transition. The molecular transition is realized using a hollow-core fiber filled with acetylene gas. Finally, we measured the absolute frequency of these molecular transitions to characterize the references. In this thesis, the major ideas, techniques and experimental results for the development and absolute frequency measurement of the portable frequency references are presented. A prism-based Cr:forsterite frequency comb is stabilized. We have used prism modulation together with power modulation inside the cavity to actively stabilize the frequency comb. We have also studied the dynamics of the carrier-envelope-offset frequency (f0) of the laser and its effect on laser stabilization. A reduction of the f0 linewidth from ~2 MHz to ~20 kHz has also been observed. Both our in-loop and out-of-loop measurements of the comb stability showed that the comb is stable to within a part in 10^11 at 1-s gate time and is currently limited by our reference signal. In order to develop this portable frequency standard, saturated absorption spectroscopy is performed on the acetylene ν1 + ν3 band near 1532 nm inside different kinds of hollow optical fibers. The observed linewidths are a factor of 2 narrower in the 20 µm fiber than in the 10 µm fiber, and vary from 20 to 40 MHz depending on pressure and power. The 70 µm kagome fiber shows a further reduction in linewidth to less than 10 MHz. 
In order to seal the gas inside the hollow optical fiber, we have also developed a technique for splicing the hollow fiber to solid fiber in a standard commercial arc splicer, rather than the more expensive filament splicer, and achieved comparable splice loss. We locked a CW laser to the saturated absorption feature using a frequency modulation technique and then compared it to an optical frequency comb. The stabilized frequency comb, which provides a dense grid of reference frequencies in the near-infrared region, is used to characterize and measure the absolute frequency reference based on these hollow optical fibers.
576

CHARACTERIZATION OF G-PATCH MOTIF CONTRIBUTION TO PRP43 FUNCTION IN THE PRE-MESSENGER RNA SPLICING AND RIBOSOMAL RNA BIOGENESIS PATHWAYS

Banerjee, Daipayan 01 January 2013
The DExD/H-box protein Prp43 is essential for two biological processes: nucleoplasmic pre-mRNA splicing and nucleolar rRNA maturation. The biological basis for the temporal and spatial regulation of Prp43 remains elusive. The Spp382/Ntr1, Sqs1/Pfa1 and Pxr1/Gno1 G-patch proteins bind to and activate the Prp43 DExD/H-box helicase in pre-mRNA splicing (Spp382) and rRNA processing (Sqs1, Pxr1). These Prp43-interacting proteins each contain the G-patch domain, a conserved sequence of ~48 amino acids that includes 6 highly conserved glycine (G) residues. Five G-patch proteins are annotated in baker’s yeast (i.e., Spp382, Pxr1, Spp2, Sqs1 and Ylr271) and, with the possible exception of the uncharacterized Ylr271 protein, all are associated with ribonucleoprotein (RNP) complexes. Understanding the role of G-patch proteins in modulating the biological function of the DExD/H-box protein Prp43 was the motivation for this thesis. The G-patch domain has been proposed as a protein-protein or protein-RNA interaction module for RNP proteins. This study found that the three Prp43-associated G-patch domains interact with Prp43 in a yeast two-hybrid (Y2H) assay but differ in apparent relative affinities. Using a systemic Y2H analysis, I identified the conserved winged-helix (WH) domain in Prp43 as a major binding site for the G-patch motif. Intriguingly, removal of the non-essential N-terminal domain (NTD) of Prp43 (amino acids 2-94) greatly improves G-patch binding, suggesting that the NTD may play a role in the modulation of enzyme activity by the G-patch effectors. I identified a second site within Pxr1 that strongly binds Prp43 but, unlike the G-patch, is dispensable for Pxr1 function in vivo. By constructing chimeric proteins, I demonstrated that individual G-patch peptides differ in their ability to reconstitute Spp382 and Pxr1 function in support of pre-mRNA splicing and rRNA biogenesis, respectively. 
Through amino acid sequence comparisons and selective mutagenesis I identified several residues within the G-patch motif critical for Prp43-stimulated pre-mRNA splicing without greatly altering its ability to bind Prp43. These data lead me to propose that the G-patch motif is not a simple Prp43 binding interface but may contribute more directly to substrate selection or Prp43 enzyme activation in the biologically distinct pre-mRNA splicing and rRNA processing pathways.
577

Régulation de l'épissage de la télomérase lors de la lymphomagenèse induite par l'herpèsvirus oncogène aviaire de la maladie de Marek / Regulation of splicing of the avian telomerase gene during lymphomagenesis induced by an avian oncogenic herpesvirus of Marek's disease

Amor, Souheila 10 December 2010
La télomérase, composée de l’ARN TR et de la protéine TERT, responsable du maintien de la longueur des télomères est surexprimée dans la majorité des cellules cancéreuses. La dynamique de la régulation post-transcriptionnelle de TERT sur l’activation de la télomérase a été étudiée dans le modèle de lymphomagenèse induite par l’herpesvirus oncogène aviaire de la maladie de Marek. L’augmentation de l’activité télomérase des TCD4+ lors de l’apparition des lymphomes résulte d’une hausse du transcrit constitutif et de celle des transcrits cibles de la voie de dégradation du « non-sense mediated decay » (NMD) alors que l’activité télomérase basale des TCD4+ non infectés est contrôlée par les isoformes dominantes négatives. La caractérisation de la protéine virale ICP27 de MDV-1, régulateur potentiel de l’épissage des gènes, qui s’exprime pendant la phase de réplication lytique du virus a complété cette étude. ICP27 est capable de co-localiser et d’interagir avec les protéines SR du splicéosome ainsi que de réguler négativement l’épissage des gènes cellulaire TERT et viral vIL8 de manière similaire à ICP27 de l’herpesvirus simplex 1. Le modèle naturel de lymphomagenèse induite par MDV-1 a permis d’établir pour la première fois un lien entre l’activation de la télomérase in vivo et la régulation de l’épissage de TERT, à laquelle pourrait participer la protéine virale ICP27. / Telomerase, which consists of an RNA template (TR) and a reverse transcriptase (TERT), maintains telomere length and is highly expressed in the majority of cancer cells. The splicing regulation of TERT was studied in Marek’s disease (MD), a natural lymphoma induced by MDV-1, the avian MD herpesvirus. The telomerase activation observed in TCD4+ cells at the onset of MD lymphoma was due to an increase in the constitutively spliced transcript and in the transcripts targeted by nonsense-mediated decay (NMD), while the basal telomerase activity of non-infected TCD4+ cells was controlled by dominant-negative isoforms. 
In addition, the viral protein ICP27, a putative regulator of splicing that is expressed during MDV-1 lytic infection, was characterised. ICP27 co-localized and interacted with spliceosome SR proteins and negatively controlled splicing of the cellular TERT gene and the viral vIL8 gene in a way similar to that of ICP27 of herpes simplex virus 1. The MD model provides the only data on the in vivo regulation of TERT splicing, possibly mediated by ICP27, and telomerase activation during lymphomagenesis induced by a herpesvirus in its natural host.
578

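Influence de TDP-43 sur la régulation de hnRNP A1 : un impact potentiel sur la sclérose latérale amyotrophique / Influence of TDP-43 on the regulation of hnRNP A1: a potential impact on amyotrophic lateral sclerosis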

Stabile, Stéphanie 12 1900
La SLA est une maladie neurodégénérative fatale se déclenchant tardivement. Elle est caractérisée par la perte des neurones moteurs supérieurs et inférieurs. Jusqu’à présent, aucun traitement ne permet de ralentir ou de guérir la maladie de façon robuste. De récentes découvertes portant sur TDP-43 et hnRNP A1 y ont identifié des mutations reliées à des cas de SLA. Comme les deux possèdent de multiples fonctions dans le métabolisme de l’ARN, l’impact de ces mutations devient difficile à définir. Notre hypothèse est que TDP-43 régule hnRNP A1 et que les mutations causant la SLA dérégulent ce mécanisme, aboutissant ainsi à un impact majeur sur la vulnérabilité des neurones moteurs. Nos résultats démontrent que TDP-43 lie l’ARNm de hnRNP A1, mais n’affecte pas sa stabilité. En revanche, TDP-43 réprime l’expression de hnRNP A1. Ce mécanisme pourrait être appliqué in vivo où le ratio protéique hnRNP A1B/A1 augmente chez les souris âgées et davantage chez les TDP-43A315T dans la région cervicale et lombaire de la moelle épinière. Cette différence n’est pas causée par un défaut de l’épissage alternatif. Aussi, la mutation TDP-43A315T serait davantage responsable de cette différence que la surexpression de TDP-43 (résultats obtenus en culture). L’impact d’une telle augmentation sur la cellule pourrait être la formation d’agrégats puisque la forme hnRNP A1B possède quatre domaines de fibrillation de plus que hnRNP A1. Nos résultats pourraient donc fournir un mécanisme potentiel de la formation d’inclusions cytoplasmiques reconnues comme étant une des caractéristiques pathologiques principales de la SLA. / ALS is a fatal, late-onset neurodegenerative disease characterized by the selective loss of lower and upper motor neurons. To date, no treatment robustly slows or cures the disease. Recent discoveries concerning TDP-43 and hnRNP A1 have identified mutations in both that are linked to ALS cases. Both proteins have multiple functions in RNA metabolism, making the impact of these mutations difficult to define. 
Our hypothesis is that TDP-43 regulates hnRNP A1 and that the ALS-causing mutations deregulate this mechanism, with a major impact on the vulnerability of motor neurons. Our results demonstrate that TDP-43 binds hnRNP A1 mRNA but does not affect its stability. In contrast, TDP-43 represses the expression of hnRNP A1. This mechanism could apply in vivo, where the hnRNP A1B/A1 protein ratio increases in aged mice, and even more in TDP-43A315T mice, in the cervical and lumbar regions of the spinal cord. This difference is not caused by a defect in alternative splicing. Also, the TDP-43A315T mutation appears to be more responsible for this difference than the overexpression of TDP-43 (results from cell culture). The impact of such an increase on the cell could be the formation of aggregates, since the hnRNP A1B form has four more fibrillation domains than hnRNP A1. Our findings could thus provide a potential mechanism for the formation of cytoplasmic inclusions, recognized as one of the main pathological features of ALS.
579

Étude protéomique de la microhétérogénéité des caséines αs1 et β équines : identification des variants transcriptionnels et de phosphorylation ; identification des sites phosphorylés de la caséine β / Proteomic study of the microheterogeneity of equine αs1 and β caseins: identification of post-transcriptional and phosphorylation variants; identification of phosphorylated sites of β casein

Mateos, Aurélie 21 November 2008
La caséine β (CN-β) et la caséine αs1 (CN-αs1) du lait de jument possèdent un taux variable de phosphorylation et sont de bons modèles d’étude de l’influence du degré de phosphorylation et de la séquence peptidique sur la chélation de caséinophosphopeptides (CPP) avec des minéraux d’intérêt nutritionnel. Avant d’envisager une telle étude, la structure des caséines doit être déterminée précisément. Notre travail a été consacré à la caractérisation de variants post-transcriptionnels et post-traductionnels de CN-αs1 et de CN-β. Après fractionnement chromatographique et analyse par spectrométrie de masse, la CN-αs1 entière, trois variants d’épissage alternatif des exons 7 et 14 de la CN-αs1 et quatre variants délétés du résidu Gln91, résultat de l’utilisation d’un site d’épissage cryptique, ont été identifiés dans le lait équin. Nous avons montré que le degré de phosphorylation de ces isoformes varie de 2 à 8 groupes phosphates. Au total, 36 isoformes différentes de CN-αs1 ont été caractérisées. La cartographie bidimensionnelle de la CN-β a été établie avec précision après avoir isolé par chromatographie chacun des variants de phosphorylation (possédant de 3 à 7 groupes phosphates) et après avoir caractérisé les formes de CN-β modifiées par désamidation non enzymatique du résidu Asn135. 
Des CPP trypsiques de chaque variant de phosphorylation ont été préparés avec une technique récente de chromatographie d’affinité au dioxyde de titane, ce qui a permis de localiser par spectrométrie de masse en tandem les sites phosphorylés de la CN-β (Ser9, Ser10, Thr12, Ser18, Ser23, Ser24, Ser25) et de montrer que la phosphorylation de la CN-β n’est pas aléatoire mais séquentielle. / Equine β-casein (β-CN) and αs1-casein (αs1-CN) have a variable degree of phosphorylation and are good models for studying the influence of phosphorylation degree and peptide sequence on the chelation of caseinophosphopeptides (CPP) with minerals of nutritional interest. Before such a study can be considered, the structure of the caseins must be precisely determined. Our work was devoted to the characterization of post-transcriptional and post-translational variants of αs1-CN and β-CN. For αs1-CN, the full-length protein, three alternative splicing variants involving exons 7 and 14, and four variants lacking residue Gln91 as a result of the use of a cryptic splice site were identified in mare’s milk after chromatographic fractionation and mass spectrometric analysis. The phosphorylation degree of these variants ranges from 2 to 8 phosphate groups. In total, 36 isoforms of αs1-CN were identified. Isolating each phosphorylation variant of β-CN (carrying 3 to 7 phosphate groups) by chromatography, and characterizing the forms of β-CN modified by non-enzymatic deamidation of residue Asn135, made it possible to establish a precise two-dimensional map of β-CN. After hydrolysis by trypsin, CPP of each phosphorylation variant were prepared by titanium dioxide affinity chromatography, a recent technique, which made it possible to locate the phosphorylated sites of β-CN (Ser9, Ser10, Thr12, Ser18, Ser23, Ser24, Ser25) by tandem mass spectrometry. 
It was shown that the phosphorylation of β-CN is not a random process but a sequential one.
580

Semi-supervised and transductive learning algorithms for predicting alternative splicing events in genes.

Tangirala, Karthik January 1900
Master of Science / Department of Computing and Information Sciences / Doina Caragea / As genomes are sequenced, a major challenge is their annotation -- the identification of genes and regulatory elements, their locations and their functions. For years, it was believed that one gene corresponds to one protein, but the discovery of alternative splicing provided a mechanism for generating different gene transcripts (isoforms) from the same genomic sequence. In recent years, it has become obvious that a large fraction of genes undergo alternative splicing. Thus, understanding alternative splicing is a problem of great interest to biologists. Supervised machine learning approaches can be used to predict alternative splicing events at the genome level. However, supervised approaches require large amounts of labeled data to produce accurate classifiers. While large amounts of genomic data are produced by the new sequencing technologies, labeling these data can be costly and time-consuming. Therefore, semi-supervised learning approaches that can make use of large amounts of unlabeled data, in addition to small amounts of labeled data, are highly desirable. In this work, we study the usefulness of a semi-supervised learning approach, co-training, for classifying exons as alternatively spliced or constitutive. The co-training algorithm makes use of two views of the data to iteratively learn two classifiers that can inform each other, at each step, with their best predictions on the unlabeled data. We consider three sets of features for constructing views for the problem of predicting alternatively spliced exons: lengths of the exon of interest and its flanking introns, exonic splicing enhancers (a.k.a. ESE motifs) and intronic regulatory sequences (a.k.a. IRS motifs). Naive Bayes and Support Vector Machine (SVM) algorithms are used as base classifiers in our study. 
Experimental results show that the use of unlabeled data can result in better classifiers than those obtained from the small amount of labeled data alone. In addition to semi-supervised approaches, we also study the usefulness of graph-based transductive learning approaches for predicting alternatively spliced exons. Like the semi-supervised learning algorithms, transductive learning algorithms can make use of unlabeled data, together with labeled data, to produce labels for the unlabeled data. However, a classification model that could be used to classify new unlabeled data is not learned in this case. Experimental results show that graph-based transductive approaches can make effective use of the unlabeled data.
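The co-training loop described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the tiny Gaussian naive Bayes class, the function names `fit_on_labeled` and `co_train`, the `rounds` and `per_round` parameters, and the synthetic two-view data are all assumptions made for illustration. The two views merely stand in for feature sets such as length features and motif counts.

```python
import math


class GaussianNB:
    """Tiny Gaussian naive Bayes for binary labels (0/1), standard library only."""

    def fit(self, X, y):
        # Assumes both classes occur at least once in y.
        self.stats = {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(col) for col in cols]
            # A variance floor avoids division by zero on tiny labeled sets.
            variances = [max(sum((v - m) ** 2 for v in col) / len(col), 1e-3)
                         for col, m in zip(cols, means)]
            self.stats[c] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def prob_pos(self, x):
        """Posterior probability of class 1 for one feature vector."""
        log_p = {}
        for c, (log_prior, means, variances) in self.stats.items():
            log_p[c] = log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances))
        top = max(log_p.values())
        e0, e1 = math.exp(log_p[0] - top), math.exp(log_p[1] - top)
        return e1 / (e0 + e1)


def fit_on_labeled(view, labels):
    """Train one classifier on the examples that currently carry a label."""
    known = [i for i, label in enumerate(labels) if label is not None]
    return GaussianNB().fit([view[i] for i in known], [labels[i] for i in known])


def co_train(view_a, view_b, labels, rounds=10, per_round=2):
    """labels holds 0/1 for labeled examples and None for unlabeled ones."""
    labels = list(labels)
    for _ in range(rounds):
        if None not in labels:
            break
        for view in (view_a, view_b):
            clf = fit_on_labeled(view, labels)
            # Rank still-unlabeled examples by confidence (distance from 0.5).
            scored = sorted(
                ((abs(clf.prob_pos(view[i]) - 0.5), i)
                 for i, label in enumerate(labels) if label is None),
                reverse=True)
            # The most confident predictions become labels shared by both views.
            for _conf, i in scored[:per_round]:
                labels[i] = 1 if clf.prob_pos(view[i]) >= 0.5 else 0
    return fit_on_labeled(view_a, labels), fit_on_labeled(view_b, labels), labels


# Synthetic, well-separated toy data (hypothetical values, not from the thesis).
base_a = [[k / 9, (3 * k % 10) / 9] for k in range(10)]
base_b = [[(7 * k % 10) / 9, (9 - k) / 9] for k in range(10)]
view_a = base_a + [[v + 10 for v in row] for row in base_a]
view_b = base_b + [[v + 10 for v in row] for row in base_b]
truth = [0] * 10 + [1] * 10
seeds = {0, 4, 9, 10, 14, 19}  # only these six examples start out labeled
start = [truth[i] if i in seeds else None for i in range(20)]

clf_a, clf_b, final_labels = co_train(view_a, view_b, start)
print(final_labels == truth)
```

In this sketch each classifier sees only its own view, and the labels it is most confident about are added to a shared pool, so the other classifier is retrained on them in the next step. Real splicing features would be far noisier than this toy data, so confident but wrong pseudo-labels are a practical risk that the sketch does not address.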
