Limited genomic reconstruction of SARS-CoV-2 transmission history within local epidemiological clusters


A detailed understanding of how and when SARS-CoV-2 transmission occurs is crucial for designing effective prevention measures. Alongside contact tracing, genome sequencing provides information to help infer who infected whom. However, the effectiveness of the genomic approach in this context depends on both a high enough mutation rate and a low enough transmission rate. At present, the level of resolution that can be obtained when describing SARS-CoV-2 outbreaks using genomic information alone remains unclear. To answer this question, we sequenced 49 SARS-CoV-2 patient samples from ten local clusters for which partial epidemiological information was available, and inferred transmission history using genomic variants. Importantly, we obtained high-quality genomic data by sequencing each sample twice and using unique barcodes to exclude cross-sample contamination. Phylogenetic and cluster analyses showed that consensus genomes were generally sufficient to discriminate among independent transmission clusters. However, levels of intrahost variation were low, which in most cases prevented the unambiguous identification of direct transmission events. After filtering out variants that recurred across clusters, the genomic data were generally compatible with the epidemiological information but did not support specific transmission events over possible alternatives. We estimated the effective transmission bottleneck size to be 1-2 viral particles for sample pairs whose donor-recipient relationship was likely. Our analyses suggest that intrahost genomic variation in SARS-CoV-2 might be generally limited and that homoplasy and recurrent errors complicate the identification of shared intrahost variants. Reliable reconstruction of direct SARS-CoV-2 transmission based solely on genomic data seems hindered by a slow mutation rate, potential convergent events, and technical artifacts. Detailed contact tracing seems essential in most cases to study SARS-CoV-2 transmission at high resolution.

Nicolae Sapoval
PhD student

Nick, a third-year PhD student, obtained a B.S. degree in Computer Science and a B.S. with Honors in Mathematics from the University of Chicago. At the University of Chicago, Nick worked in wireless networks research and later in computational biophysics, focusing on conformational transition modeling for insulin-degrading enzyme. His current interests are in computational biology, with a focus on genomic data.