This benchmarking workflow can be used to assess the performance of germline variant calling pipelines run on the Genome in a Bottle (GIAB) samples HG002, HG003, and HG004. It is an implementation of the GA4GH best practices that uses vcfeval as the comparison engine and 'genotype match' to calculate true positives, false positives, and false negatives. Additional statistics are generated for partial matches.
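As an illustration of how true positive, false positive, and false negative counts turn into the usual benchmarking metrics, here is a minimal Python sketch. The `Counts` class and the function names are hypothetical and are not part of the workflow. It also assumes a single true positive count, whereas comparison engines may report separate truth-side and query-side true positive counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for genotype-match counts from one comparison."""
    tp: int  # calls that match the truth set
    fp: int  # calls with no match in the truth set
    fn: int  # truth variants with no matching call


def precision(c: Counts) -> float:
    # Fraction of calls that are correct; 0.0 when there are no calls.
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Counts) -> float:
    # Fraction of truth variants that were recovered; 0.0 when truth is empty.
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_score(c: Counts) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    example = Counts(tp=90, fp=10, fn=30)  # made-up numbers for illustration
    print(precision(example), recall(example), f1_score(example))
```

The zero-denominator guards keep the functions usable on empty strata, for example a region with no truth variants.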
The workflow allows for selection of 'competitors' and 'regions'. Competitor selection allows you to compare your VCF against the winning VCF submissions for the PrecisionFDA Truth Challenge V2. Region selection allows you to specify which genomic regions are used for the comparison.
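To make the idea of region selection concrete, the sketch below assembles an RTG Tools `vcfeval` command line in Python, restricting the comparison to a BED file when one is given. This is an assumption about how such a comparison could be invoked by hand, not the workflow's actual command. The function name and all file paths are hypothetical.

```python
from typing import List, Optional


def build_vcfeval_command(
    baseline_vcf: str,
    query_vcf: str,
    reference_sdf: str,
    output_dir: str,
    regions_bed: Optional[str] = None,
) -> List[str]:
    """Assemble an `rtg vcfeval` argument list (hypothetical helper)."""
    cmd = [
        "rtg", "vcfeval",
        "--baseline", baseline_vcf,   # truth VCF, e.g. a GIAB benchmark set
        "--calls", query_vcf,         # VCF being evaluated
        "--template", reference_sdf,  # reference genome in RTG SDF format
        "--output", output_dir,       # directory for the comparison results
    ]
    if regions_bed is not None:
        # Only variants inside these regions count toward the metrics.
        cmd += ["--evaluation-regions", regions_bed]
    return cmd


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    print(" ".join(build_vcfeval_command(
        "truth.vcf.gz", "query.vcf.gz", "reference.sdf", "vcfeval_out",
        regions_bed="regions.bed",
    )))
```

If RTG Tools is installed, the returned list can be passed to `subprocess.run(cmd, check=True)`.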
The workflow generates two HTML reports: a MultiQC report generated using bcftools, and a notebook, generated using papermill, that contains UpSet plots, precision-recall metrics, and indel size distribution plots.
Metrics from each benchmarking run are also loaded into the Truwl Performance Metrics table, which enables you to compare metrics across multiple benchmarking runs.
Truwl has put all VCF submissions to the PrecisionFDA Truth Challenge V2 in a publicly accessible bucket. The URIs for these files can be used as query VCFs for testing this workflow.
Running the benchmarking part of this workflow is not computationally expensive and costs ~$1/run. Comparing against other VCFs (competitors) can raise the cost significantly: for example, jobs with 8 competitors can cost $10 or more. As always, it's a good idea to run a test job to get an idea of costs before running many samples.