Application of machine learning for the clustering of wheat transcription factor proteins into families and sub-families

Wheat plays an important role in ensuring global food security. Salinity of soil and water poses a major threat to wheat production and negatively affects both the growth and the development of the plant. Wheat plants use certain molecular mechanisms to adapt to salt stress. Transcription factor proteins control the response of wheat to abiotic stresses such as salinity. There are 56 transcription factor protein families in the wheat genome. However, these families are not classified into subfamilies. The main goal of this research study is to understand how machine learning algorithms can be used to identify and cluster transcription factor proteins into subfamilies, which can help in associating them with specific biological processes such as the salt stress response. In this project, the K-means clustering method is used to cluster the WRKY transcription factor family into subfamilies. The WRKY family was grouped into three distinct clusters. Cluster validation was performed using external validation and yielded a validation score of 90%. The method can also be applied to other transcription factor families. Ultimately, this can help in producing wheat varieties that tolerate abiotic stresses such as salinity, which in turn can improve crop yield.
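The clustering step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the thesis's method: the sequences are made-up toy strings (not real WRKY proteins), the features are plain amino acid composition, the initial centroids are chosen by a farthest-first rule to keep the run deterministic, and purity stands in for the unspecified external validation score.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids (assumed feature set)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]

def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Plain K-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def purity(labels, truth):
    """External validation: share of points in their cluster's majority class."""
    matched = 0
    for c in set(labels):
        classes = Counter(t for l, t in zip(labels, truth) if l == c)
        matched += classes.most_common(1)[0][1]
    return matched / len(labels)

# Hypothetical toy sequences with hypothetical subfamily labels.
toy = [
    ("AAAAAAAAGL", "I"), ("AAAAAAAGLA", "I"), ("AAAAAAAAAG", "I"),
    ("KKKKKKKKRE", "II"), ("KKKKKKKREK", "II"), ("KKKKKKKKKR", "II"),
    ("WWWWWWWWYS", "III"), ("WWWWWWWYSW", "III"), ("WWWWWWWWWY", "III"),
]
points = [composition(seq) for seq, _ in toy]
truth = [label for _, label in toy]

labels = kmeans(points, k=3)
score = purity(labels, truth)
print(f"clusters: {labels}")
print(f"purity: {score:.2f}")
```

Real protein families are far less cleanly separated than this toy data, so a practical pipeline would likely need richer sequence features and a comparison of several values of k.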

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:su-224917
Date: January 2022
Creators: Sameer, Haleemath Sameena
Publisher: Stockholms universitet, Institutionen för data- och systemvetenskap
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
