Harvest Variants

Real-time monitoring of viral mutations in circulating SARS-CoV-2 lineages, at both the intrahost and interhost level, is a key step in understanding changes to SARS-CoV-2 infectivity/transmissibility, vaccine efficacy, and fitness within human hosts. Since the onset of the COVID-19 pandemic, several variants of note in the Spike glycoprotein have been linked to increased infectivity and are under active investigation, including A701V, D614G, E484K, K417N, N501Y, and P681H. Previous work by our group includes the Harvest package, which features three software tools: (1) Parsnp: multiple genome alignment and SNP typing, (2) Harvest tools: variant analysis file conversion and fasta data interchange format, and (3) Gingr: interactive graphical user interface for simultaneous visualization of variants, phylogeny, synteny, and annotations of hundreds of thousands of genomes. The three components of the suite, Parsnp, harvest tools, and Gingr, were originally designed for intraspecific multiple genome alignment, variant detection, and simultaneous visualization of phylogeny and multiple sequence alignments, respectively. The Harvest suite was published in 2014, has been available on GitHub (https://github.com/marbl/harvest) since 2013, and has been supported for over 7 years. While these tools have been widely adopted by the community, they require several improvements to maximize their potential for integrated, collaborative variant tracking of SARS-CoV-2. Here we propose the development of Harvest Variants, a SARS-CoV-2-specific enhancement to harvest tools comprising (i) support for minor variants in harvest tools, (ii) algorithmic enhancements to Parsnp and harvest tools, and (iii) genotype-to-phenotype tracking of curated SARS-CoV-2 variants within the Gingr graphical user interface.

Advait Balaji
PhD student

Advait (4th year PhD student) obtained a dual degree, a B.E. in Computer Science and an M.S. in Biological Sciences, from BITS Pilani in India. During his undergraduate degree, he received the Khorana Scholarship (2016) from the Indo-US Science and Technology Forum and a thesis fellowship (2017-18) to work at the Icahn School of Medicine at Mount Sinai, NY. At Mount Sinai, he worked on creating a subcellular process-based ontology that predicts whole-cell function using natural language processing. His research interests lie at the intersection of genomic data science and the design of efficient algorithms for analyzing genomic data.

Bryce Kille
PhD student

Bryce (1st year PhD student) received his MS in Bioinformatics and BS in Computer Science + Chemistry from the University of Illinois at Urbana-Champaign. As an undergraduate, he worked at Dow AgroSciences in both the computational biology and cheminformatics groups. His projects included developing software for phylogeny analysis and creating models for compound activity prediction. During his Master's program, Bryce worked in a biochemistry lab developing software for genome mining, as well as on a research project creating bit-wise algorithms for the C++ STL. One of his main interests is casting biological and chemical problems as theoretical computer science questions.

Dr. Mike Nute
Postdoctoral Scientist

Mike (Postdoctoral Scientist) received his Ph.D. in Statistics in 2019 from the University of Illinois at Urbana-Champaign, where he was advised by Dr. Tandy Warnow in the Department of Computer Science. There he worked on algorithms for multiple sequence alignment and phylogenetic tree estimation, in particular applying these methods to the study of microbial communities. He was co-advised by Dr. Rebecca Stumpf in the Department of Anthropology, where he and other lab members developed novel methods to compare the microbiomes of human and non-human primates. His research interest is in discovering new applications for our understanding of microbial communities.

R. Matt Barnett
Senior Research Programmer

Matt (Senior Research Programmer) received his undergraduate degree in Computer Science from Rice University. He provides professional software engineering support to multiple research projects in the Treangen lab.

Todd J. Treangen
Assistant Professor of Computer Science

My research interests include algorithms and data structures for efficient analysis of microbial genomes and metagenomes.
