Gene expression is a multi-step process involving many regulators. High-throughput technologies have generated large amounts of omics data, from whole-genome sequences to measurements of complex gene regulatory systems, but data on this scale are hard to interpret manually. Bioinformatics helps to process this biological information and to infer biological insights by drawing on mathematics, statistics and computational techniques. In this study, we apply various bioinformatic techniques to several aspects of gene regulation.
Multiple primary transcripts of a gene can be initiated at different promoters, termed alternative promoters (APs). Most human genes have multiple APs. However, whether APs are used independently of one another remains controversial. In this study, we analyze the roles of APs in gene regulation using various bioinformatics approaches. Chromosomal interactions between APs are found to be more frequent than interactions between different genes. By comparing the APs at the two ends of genes, we find that they differ significantly in sequence content, conservation and motif frequency. The position and distance of two APs are important for their combined effects, showing that their regulation is not independent and that one AP can affect transcription from the other.
With the aim of understanding the multi-level gene regulatory system in various biological processes, a large volume of high-throughput omics data has been generated. However, each omics technology measures molecular abundance or behavior at a single level and therefore has a limited ability to depict the multi-level system. Integrating omics data gives a more complete picture of the multi-level gene regulatory system and reduces false positives. In this study, two web servers, ChIP-Array and ProteoMirExpress, have been built to construct transcriptional and post-transcriptional regulatory networks by integrating omics data. ChIP-Array is a web server that lets biologists construct a transcription factor (TF)-centered network from their own data. A network library is further constructed with ChIP-Array from publicly available data. Given a series of mRNA expression profiles from a biological process, master regulators can be identified by matching the profiles against the networks in the library. To explore gene regulatory networks controlled by multiple TFs, least absolute shrinkage and selection operator (LASSO)-type regularization models are applied to multiple integrated data sets. Gold-standard-based evaluations demonstrate that the L0 and L1/2 regularization models are efficient and applicable to gene regulatory network inference in large genomes with a small number of samples. ProteoMirExpress integrates transcriptomic and proteomic data to infer miRNA-centered networks. It successfully infers the perturbed miRNA and the miRNAs co-expressed with it. The resulting network reports miRNA targets whose mRNA and protein levels are uncorrelated, which are usually ignored by tools that consider only mRNA abundance, even though some of them may be important downstream regulators.
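As a rough illustration of how a LASSO-type model can select regulators for a target gene, the sketch below fits a plain L1-penalized regression by cyclic coordinate descent. It is not the L0 or L1/2 models evaluated in the thesis, and it is not the code behind ChIP-Array. The function names, the penalty value and the toy expression matrix are all hypothetical and chosen only to make the shrinkage behavior visible.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the L1 proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(tf_levels, target_levels, penalty, sweeps=200):
    """Minimise (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent.

    tf_levels: list of samples, each a list of TF expression values (X).
    target_levels: expression of one target gene per sample (y).
    Returns one weight per TF; a weight of zero means the TF is not selected.
    """
    n_samples = len(tf_levels)
    n_tfs = len(tf_levels[0])
    weights = [0.0] * n_tfs
    for _ in range(sweeps):
        for j in range(n_tfs):
            rho = 0.0   # x_j dotted with the residual that leaves out TF j
            norm = 0.0  # squared length of column j
            for i in range(n_samples):
                prediction_without_j = sum(
                    tf_levels[i][k] * weights[k] for k in range(n_tfs) if k != j
                )
                rho += tf_levels[i][j] * (target_levels[i] - prediction_without_j)
                norm += tf_levels[i][j] ** 2
            if norm == 0.0:
                weights[j] = 0.0
            else:
                weights[j] = soft_threshold(rho / n_samples, penalty) / (norm / n_samples)
    return weights


# Hypothetical toy data: 4 samples, 3 candidate TFs (centered, made-up values).
# The target follows the first TF only (target = 2 * TF0).
tf_levels = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
target_levels = [2.0, -2.0, 2.0, -2.0]

weights = lasso_coordinate_descent(tf_levels, target_levels, penalty=0.5)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print(weights, selected)
```

In this toy case only the first TF keeps a non-zero weight, and that weight is shrunk below the true value of 2 by the penalty. The zeroed weights are what make sparse penalties attractive when there are many candidate regulators and few samples.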
In summary, this study analyzes gene regulation at multiple levels and develops several tools for gene network construction and regulator analysis using multiple omics data. These tools help researchers to process high-throughput raw data efficiently and to formulate biological hypotheses and interpretations. / published_or_final_version / Biochemistry / Doctoral / Doctor of Philosophy
Identifier | oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/205684 |
Date | January 2013 |
Creators | Qin, Jing, 覃静 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Source Sets | Hong Kong University Theses |
Language | English |
Detected Language | English |
Type | PG_Thesis |
Rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. Creative Commons: Attribution 3.0 Hong Kong License |
Relation | HKU Theses Online (HKUTO) |