
Abstract

Wastewater-based epidemiology has been used extensively throughout the COVID-19 (coronavirus disease 2019) pandemic to detect and monitor the spread and prevalence of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) and its variants. It has proven an excellent complement to clinical sequencing, supporting the insights gained and helping to inform public-health decisions. Consequently, many groups globally have developed bioinformatics pipelines to analyse sequencing data from wastewater. Accurate calling of mutations is critical in this process and in the assignment of circulating variants; yet, to date, the performance of variant-calling algorithms in wastewater samples has not been investigated. To address this, we compared the performance of six variant callers widely used in bioinformatics pipelines (VarScan, iVar, GATK, FreeBayes, LoFreq and BCFtools) on 19 synthetic samples with known ratios of three different SARS-CoV-2 variants of concern (VOCs) (Alpha, Beta and Delta), as well as 13 wastewater samples collected in London between 15 and 18 December 2021. We used the fundamental parameters of recall (sensitivity) and precision (positive predictive value) to confirm the presence of mutational profiles defining specific variants across the six variant callers. Our results show that BCFtools, FreeBayes and VarScan found the expected variants with higher precision and recall than GATK or iVar, although the latter identified more expected defining mutations than other callers. LoFreq gave the least reliable results due to the high number of false-positive mutations detected, resulting in lower precision. Similar results were obtained for both the synthetic and wastewater samples.
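As an illustration of the two metrics named in the abstract, the following minimal sketch assumes the usual definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), applied to the set of mutations a caller reports versus the set expected for a variant. The function name and the mutation labels are hypothetical and are not taken from the study's data or code.

```python
def precision_recall(called, expected):
    """Return (precision, recall) for called mutations against an expected set."""
    called, expected = set(called), set(expected)
    tp = len(called & expected)   # expected mutations that were called
    fp = len(called - expected)   # called mutations that were not expected
    fn = len(expected - called)   # expected mutations that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up mutation labels, for illustration only.
expected = {"A100T", "C200G", "G300A", "T400C"}
called = {"A100T", "C200G", "G300A", "C500T", "G600T"}

precision, recall = precision_recall(called, expected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these definitions, a caller that reports many unexpected mutations loses precision even when its recall is high, which is the pattern the abstract describes for LoFreq.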

Funding
This study was supported by:
  • NERC grant (Award NE/V010441/1)
    • Principal Award Recipient: Terry A. Burke

This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
/content/journal/mgen/10.1099/mgen.0.000933
2023-04-19
2024-03-28