dc.contributor.author
Greminger, Maja P.
dc.contributor.author
Stölting, Kai N.
dc.contributor.author
Nater, Alexander
dc.contributor.author
Goossens, Benoit
dc.contributor.author
Arora, Natasha
dc.contributor.author
Bruggmann, Rémy
dc.contributor.author
Patrignani, Andrea
dc.contributor.author
Nussberger, Beatrice
dc.contributor.author
Sharma, Reeta
dc.contributor.author
Kraus, Robert H.S.
dc.contributor.author
Ambu, Laurentius N.
dc.contributor.author
Singleton, Ian
dc.contributor.author
Chikhi, Lounes
dc.contributor.author
van Schaik, Carel P.
dc.contributor.author
Krützen, Michael
dc.date.accessioned
2018-10-08T15:46:38Z
dc.date.available
2017-06-11T15:55:25Z
dc.date.available
2017-11-30T16:46:03Z
dc.date.available
2018-10-08T15:46:38Z
dc.date.issued
2014-01
dc.identifier.other
10.1186/1471-2164-15-16
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/97547
dc.identifier.doi
10.3929/ethz-b-000097547
dc.description.abstract
Background High-throughput sequencing has opened up exciting possibilities in population and conservation genetics by enabling the assessment of genetic variation at genome-wide scales. One approach to reducing genome complexity, i.e. investigating only parts of the genome, is reduced-representation library (RRL) sequencing. Like similar approaches, RRL sequencing reduces ascertainment bias because single-nucleotide polymorphisms (SNPs) are discovered and genotyped simultaneously, and it does not require reference genomes. Yet, generating such datasets remains challenging due to laboratory and bioinformatical issues. In the laboratory, current protocols require improvements with regard to sequencing homologous fragments to reduce the number of missing genotypes. From the bioinformatical perspective, the reliance of most studies on a single SNP caller disregards the possibility that different algorithms may produce disparate SNP datasets.
Results We present an improved RRL (iRRL) protocol that maximizes the generation of homologous DNA sequences, thus achieving improved genotyping-by-sequencing efficiency. Our modifications facilitate generation of single-sample libraries, enabling individual genotype assignments instead of pooled-sample analysis. We sequenced ~1% of the orangutan genome with 41-fold median coverage in 31 wild-born individuals from two populations. SNPs and genotypes were called using three different algorithms. We obtained substantially different SNP datasets depending on the SNP caller. Genotype validations revealed that the Unified Genotyper of the Genome Analysis Toolkit and SAMtools performed significantly better than a caller from CLC Genomics Workbench (CLC). Of all conflicting genotype calls, CLC was correct in only 17% of cases. Furthermore, conflicting genotypes between two algorithms showed a systematic bias in that one caller almost exclusively assigned heterozygotes, while the other almost exclusively assigned homozygotes.
Conclusions Our enhanced iRRL approach greatly facilitates genotyping-by-sequencing and thus direct estimates of allele frequencies. Our direct comparison of three commonly used SNP callers emphasizes the need to question the accuracy of SNP and genotype calling, as we obtained considerably different SNP datasets depending on caller algorithms, sequencing depths and filtering criteria. These differences affected scans for signatures of natural selection, and will also exert undue influence on demographic inferences. This study presents the first effort to generate a population genomic dataset for wild-born orangutans with known population provenance.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
BioMed Central
en_US
dc.rights.uri
http://creativecommons.org/licenses/by/2.0/
dc.subject
Next-generation sequencing
en_US
dc.subject
Single-nucleotide polymorphisms
en_US
dc.subject
Reduced-representation libraries
en_US
dc.subject
Bioinformatics
en_US
dc.subject
GATK
en_US
dc.subject
SAMtools
en_US
dc.subject
CLC genomics workbench
en_US
dc.subject
Great apes
en_US
dc.title
Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms
en_US
dc.type
Journal Article
dc.rights.license
Creative Commons Attribution 2.0 Generic
ethz.journal.title
BMC Genomics
ethz.journal.volume
15
en_US
ethz.pages.start
16
en_US
ethz.size
15 p.
en_US
ethz.version.deposit
publishedVersion
en_US
ethz.identifier.nebis
004256340
ethz.publication.place
London
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00003 - Schulleitung und Dienste::00022 - Bereich VP Forschung & Wirtschaftsbez. / Domain VP Research & Corporate Relations::02207 - Functional Genomics Center Zürich / Functional Genomics Center Zürich
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00003 - Schulleitung und Dienste::00022 - Bereich VP Forschung & Wirtschaftsbez. / Domain VP Research & Corporate Relations::02207 - Functional Genomics Center Zürich / Functional Genomics Center Zürich
ethz.date.deposited
2017-06-11T15:55:37Z
ethz.source
ECIT
ethz.identifier.importid
imp593652e364cf650111
ethz.ecitpid
pub:152522
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2017-07-14T20:33:47Z
ethz.rosetta.lastUpdated
2018-10-08T15:46:41Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Generation%20of%20SNP%20datasets%20for%20orangutan%20population%20genomics%20using%20improved%20reduced-representation%20sequencing%20and%20direct%20comparisons%20of%20S&rft.jtitle=BMC%20Genomics&rft.date=2014-01&rft.volume=15&rft.spage=16&rft.au=Greminger,%20Maja%20P.&St%C3%B6lting,%20Kai%20N.&Nater,%20Alexander&Goossens,%20Benoit&Natasha,%20Arora&rft.genre=article&