Multi-platform discovery of haplotype-resolved structural variation in human genomes
Chaisson, Mark J.P.
Sanders, Ashley D.
Nelson, Bradley J.
Working Paper
Rights / license: Creative Commons Attribution-NoDerivatives 4.0 International
The incomplete identification of structural variants from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long- and short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,181 indel variants (<50 bp) and 31,599 structural variants (≥50 bp) per human genome, a sevenfold increase in structural variation compared to previous reports, including from the 1000 Genomes Project. We also discover 156 inversions per genome, most of which previously escaped detection, as well as large unbalanced chromosomal rearrangements. We provide near-complete, haplotype-resolved structural variation for three genomes that can now be used as a gold standard for the scientific community, and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.
Journal / series: bioRxiv
Publisher: Cold Spring Harbor Laboratory
Organisational unit: 03627 - Nelson, Bradley J.