precisionFDA Truth Challenge V2 input fastqs

Version: 1.0
Identifier: FC_ac361c.1
File

Description


~35X Illumina, ~35X PacBio HiFi, and ~50X Oxford Nanopore Technologies (ONT) fastq files for subjects HG002, HG003, and HG004, provided by the Genome in a Bottle consortium to participants in the [precisionFDA Truth Challenge V2](https://data.nist.gov/od/id/mds2-2336) for use in variant calling. The Illumina, PacBio, and ONT sequencing data cover the Genome In A Bottle Ashkenazim Jewish trio (son: HG002, father: HG003, mother: HG004). The three genomes were sequenced with similar sequencing conditions and instruments.

For the Illumina dataset, a 2x151 bp high-coverage PCR-free library was sequenced on the NovaSeq 6000 System (manuscript in preparation).

For PacBio HiFi, we used the library size and coverage recommended by PacBio at the time for variant calling: 15 kb libraries at ~35X coverage. For HG002, 4 SMRT Cells were sequenced using the Sequel II System with 2.0 chemistry. Consensus basecalling was performed using the Circular Consensus Sequencing analysis in SMRT Link v8.0, ccs version 4.0.0. Data from the 15 kb library SMRT Cells were merged and downsampled to 35X.

The ONT dataset was generated with an unsheared DNA library prep, using methods described elsewhere (Shafin et al., 2020), and consists of pooled sequencing data from three PromethION R9.4 flow cells per genome. Basecalling was performed using Guppy Version 3.6 (https://community.nanoporetech.com). Although three flow cells were used for each of the three genomes, the resulting coverage was substantially higher for the parents (85X) than for the child (47X), with similar read length distributions.
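
The description does not say which tool was used to downsample the merged PacBio data to 35X. As a rough illustration of the arithmetic only, the Python sketch below estimates mean coverage from read lengths and derives the fraction of reads to keep to reach a target depth. The function names, the approximate genome size of 3.1 Gb, and the random per-read sampling are all assumptions made for this example, not the consortium's method.

```python
import random

# Assumed approximate human genome size in base pairs (illustrative only).
GENOME_SIZE_BP = 3_100_000_000


def estimate_coverage(read_lengths, genome_size=GENOME_SIZE_BP):
    """Mean depth of coverage = total sequenced bases / genome size."""
    return sum(read_lengths) / genome_size


def downsample_fraction(current_coverage, target_coverage):
    """Fraction of reads to keep so that expected coverage hits the target.

    Returns 1.0 when the data are already at or below the target.
    """
    if current_coverage <= 0:
        raise ValueError("current_coverage must be positive")
    return min(1.0, target_coverage / current_coverage)


def downsample_reads(reads, fraction, seed=0):
    """Keep each read independently with probability `fraction`.

    Hypothetical helper: a fixed seed makes the selection reproducible.
    """
    rng = random.Random(seed)
    return [read for read in reads if rng.random() < fraction]


# Example: data at 70X would need half of the reads kept to reach 35X.
fraction_for_35x = downsample_fraction(current_coverage=70.0, target_coverage=35.0)
```

Because each read is kept independently, the resulting depth is only approximately the target; dedicated read-sampling tools may behave differently.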

File Locations

  • data.nist.gov

    https://data.nist.gov/od/id/mds2-2336

  • cloud.google.com

    https://console.cloud.google.com/storage/browser/truth-challenge-v2/input_fastqs
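
The Google Cloud link above is a web console URL. Command-line tools typically address the same location as a `gs://` URI, so a small conversion helper can be handy. The following is a minimal sketch; the function name is hypothetical, and whether the bucket is readable without credentials is an assumption that is not confirmed here.

```python
from urllib.parse import urlparse

# Path prefix used by Cloud Console storage browser links.
CONSOLE_PREFIX = "/storage/browser/"


def console_url_to_gs_uri(url):
    """Convert a Cloud Console storage browser URL to a gs:// URI."""
    parsed = urlparse(url)
    if parsed.netloc != "console.cloud.google.com":
        raise ValueError("not a Cloud Console URL: " + url)
    if not parsed.path.startswith(CONSOLE_PREFIX):
        raise ValueError("not a storage browser URL: " + url)
    # Everything after the prefix is "<bucket>/<object prefix>".
    bucket_and_prefix = parsed.path[len(CONSOLE_PREFIX):].strip("/")
    if not bucket_and_prefix:
        raise ValueError("URL does not name a bucket: " + url)
    return "gs://" + bucket_and_prefix


gs_uri = console_url_to_gs_uri(
    "https://console.cloud.google.com/storage/browser/truth-challenge-v2/input_fastqs"
)
print(gs_uri)
```

The resulting URI can then be passed to a tool such as `gsutil ls` to list the fastq files, assuming the tool is installed and the bucket permits access.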
