COVID-19 Variant Analyzer through Genomic Sequences and Jaccard Similarities
Bharadwaj, Atul Narasimha Murthy
26 March 2025
The COVID-19 pandemic has underscored the urgent need for efficient genomic surveillance to track the emergence and spread of SARS-CoV-2 variants. This study developed a novel computational framework to enhance variant detection by leveraging a database-driven approach and genomic sequence analysis. The framework uses a MySQL database in which each variant is stored in its own table, enabling rapid comparison and classification of new variants through Jaccard similarity calculations.
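For readers unfamiliar with the measure, the Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below illustrates it on k-mer sets taken from short sequences. The function names, the toy value of k, and the use of exact k-mer sets are assumptions made for illustration; tools such as sourmash estimate the same quantity from compact sketches rather than from full k-mer sets, and the thesis's own code is not shown in this abstract.

```python
def kmers(sequence, k=5):
    """Return the set of all length-k substrings of a sequence (illustrative helper)."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two toy sequences that differ by a single base in the middle.
reference = kmers("ACGTACGTTGCA", k=3)
candidate = kmers("ACGTACCTTGCA", k=3)
similarity = jaccard(reference, candidate)
```

A single substitution changes every k-mer that overlaps it, so the score drops faster for larger k; this is why the choice of k matters when setting similarity thresholds.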
The innovative aspect of this research lies in its database structure and classification method. Unlike traditional clustering approaches, this system creates an individual table for each variant, allowing for dynamic updates and efficient comparisons. When a new variant is introduced, the framework calculates Jaccard similarity scores between the new variant and the existing variant tables, and automatically creates a new table for any potentially novel variant whose scores fall below established similarity thresholds. This approach enables real-time variant tracking and classification, adapting to the evolving nature of the virus.
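The table-per-variant workflow described above might look roughly like the following sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL so that it runs without a server. The table naming scheme (`variant_1`, `variant_2`, ...), the single-column schema, and the function names are all hypothetical; the abstract does not describe the actual schema.

```python
import sqlite3


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def variant_tables(conn):
    """List the variant tables currently in the database (hypothetical naming scheme)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name LIKE 'variant_%'"
    )
    return sorted(row[0] for row in rows)


def load_kmers(conn, table):
    """Read the stored k-mer set for one variant table."""
    # Table names are generated internally below, never taken from user input.
    return {row[0] for row in conn.execute(f"SELECT kmer FROM {table}")}


def classify(conn, new_kmers, threshold):
    """Assign new_kmers to the best-matching table, or create a new table.

    Returns (table_name, best_score, created_new_table).
    """
    best_table, best_score = None, 0.0
    for table in variant_tables(conn):
        score = jaccard(new_kmers, load_kmers(conn, table))
        if score > best_score:
            best_table, best_score = table, score
    if best_table is not None and best_score >= threshold:
        return best_table, best_score, False
    # No table is similar enough: treat the input as a potentially novel variant.
    new_table = f"variant_{len(variant_tables(conn)) + 1}"
    conn.execute(f"CREATE TABLE {new_table} (kmer TEXT PRIMARY KEY)")
    conn.executemany(
        f"INSERT INTO {new_table} VALUES (?)", [(k,) for k in new_kmers]
    )
    return new_table, best_score, True


conn = sqlite3.connect(":memory:")
first = classify(conn, {"AAA", "AAC", "ACG"}, threshold=0.5)
second = classify(conn, {"AAA", "AAC", "ACG", "CGT"}, threshold=0.5)
third = classify(conn, {"TTT", "TTG"}, threshold=0.5)
```

In this toy run the first input creates `variant_1`, the second matches it with a score of 0.75, and the third shares nothing with it and so gets its own table. A production version would also need a policy for updating a table when a new member joins it, which the abstract does not specify.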
The system employs bioinformatics tools including sourmash for signature generation and NumPy for computational analysis, alongside Python-MySQL connectors for database interactions. It implements similarity thresholds of 0.817 for primary classification and 0.867 for secondary validation to determine variant group membership. Whole-genome data was analyzed alongside partial-genome data to compare their effectiveness in identifying variants of concern, with the database structure accommodating both kinds of genomic data.
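The abstract gives the two threshold values but not how they are combined. One plausible reading, shown here purely as an assumption, is a two-stage check: a score must reach 0.817 to be considered for a group at all, and must reach 0.867 to be treated as validated. The function name and the three outcome labels are hypothetical.

```python
PRIMARY_THRESHOLD = 0.817    # value stated in the abstract (primary classification)
SECONDARY_THRESHOLD = 0.867  # value stated in the abstract (secondary validation)


def membership(score):
    """Map a Jaccard score to a membership decision (assumed two-stage interpretation)."""
    if score >= SECONDARY_THRESHOLD:
        return "confirmed"    # passes both the primary and the secondary check
    if score >= PRIMARY_THRESHOLD:
        return "provisional"  # passes the primary check only
    return "novel"            # below the primary threshold: candidate new variant


decisions = [membership(s) for s in (0.95, 0.84, 0.60)]
```

Under this reading, scores in the narrow band between the two thresholds would be the ones flagged for closer inspection before a sequence is added to an existing group.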
The results demonstrated the framework's ability to accurately detect and classify SARS-CoV-2 variants with high sensitivity and specificity. The study highlighted the potential of partial-genome sequences as a cost-effective alternative for variant detection in resource-limited settings, while also revealing their limitations compared to whole-genome analysis. This research contributes to global genomic surveillance efforts by providing scalable database tools for rapid variant identification, aiding public health strategies, vaccine development, and therapeutic interventions. / Master of Science / The COVID-19 pandemic has shown how important it is to track changes in the virus that causes COVID-19. This study focused on creating better ways to find and classify new versions of the virus (variants) by analyzing its genetic material. Using bioinformatics tools, the research aimed to make it easier and faster to identify these variants and understand how they are related.
The project used methods like comparing virus genomes and grouping similar ones to see how they evolve. It also tested whether analyzing only part of the virus's genetic material could be as effective as looking at the whole genome. These techniques helped identify patterns in the virus's mutations and group them into meaningful categories.
This work is important because it provides tools that can help scientists quickly spot new or dangerous variants of COVID-19. These findings can guide public health decisions, improve vaccines, and develop treatments more effectively. By making these methods scalable and accessible, this research supports global efforts to manage the ongoing pandemic and prepare for future outbreaks.