Finding the Most Predictive Data Source in Biological Data

Classification can predict the unknown functions of proteins from known function information. In some cases, multiple data sets are available for classification, so prediction is only part of the problem: knowing which source is most reliable for prediction is also relevant. Our goal in this project is to develop classification techniques that identify the most predictive of the available data sets. The proposed algorithm combines existing classification techniques, such as linear and quadratic classification, with statistical relevance measures, such as posterior and log p analysis, to find the data set expected to give the best prediction. We apply the algorithm to experimental readings taken during the yeast cell cycle, where it predicts which genes participate in cell-cycle regulation and which type of experiment provides evidence of cell-cycle involvement for a particular gene.

https://hdl.handle.net/10365/26567

Data mining

Cell cycle

Yeast

Science -- Methodology

Identifier	oai:union.ndltd.org:ndsu.edu/oai:library.ndsu.edu:10365/26567
Date	January 2013
Creators	Chakraborty, Ushashi
Publisher	North Dakota State University
Source Sets	North Dakota State University
Detected Language	English
Type	text/thesis
Format	application/pdf
Rights	NDSU Policy 190.6.2, https://www.ndsu.edu/fileadmin/policy/190.pdf
