Evaluation of the accuracy of imputed sequence variant genotypes and their utility for causal variant detection in cattle
dc.contributor.author
Pausch, Hubert
dc.contributor.author
Macleod, Iona
dc.contributor.author
Fries, Ruedi
dc.contributor.author
Emmerling, Reiner
dc.contributor.author
Bowman, Phil J.
dc.contributor.author
Daetwyler, Hans D.
dc.contributor.author
Goddard, Michael E.
dc.date.accessioned
2017-06-22T12:12:30Z
dc.date.available
2017-06-19T12:12:27Z
dc.date.available
2017-06-22T12:08:56Z
dc.date.available
2017-06-22T12:12:30Z
dc.date.issued
2017-02-21
dc.identifier.issn
0999-193X
dc.identifier.issn
1297-9686
dc.identifier.other
10.1186/s12711-017-0301-x
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/161490
dc.identifier.doi
10.3929/ethz-b-000161490
dc.description.abstract
Background
The availability of dense genotypes and whole-genome sequence variants from various sources offers the opportunity to compile large datasets consisting of tens of thousands of individuals with genotypes at millions of polymorphic sites that may enhance the power of genomic analyses. The imputation of missing genotypes ensures that all individuals have genotypes for a shared set of variants.
Results
We evaluated the accuracy of imputation from dense genotypes to whole-genome sequence variants in 249 Fleckvieh and 450 Holstein cattle using Minimac and FImpute. The sequence variants of a subset of the animals were reduced to the variants that were included on the Illumina BovineHD genotyping array and subsequently inferred in silico using either within- or multi-breed reference populations. The accuracy of imputation varied considerably across chromosomes and dropped at regions where the bovine genome contains segmental duplications. Depending on the imputation strategy, the correlation between imputed and true genotypes ranged from 0.898 to 0.952. The accuracy of imputation was higher with Minimac than FImpute particularly for variants with a low minor allele frequency. Using a multi-breed reference population increased the accuracy of imputation, particularly when FImpute was used to infer genotypes. When the sequence variants were imputed using Minimac, the true genotypes were more correlated to predicted allele dosages than best-guess genotypes. The computing costs to impute 23,256,743 sequence variants in 6958 animals were ten-fold higher with Minimac than FImpute. Association studies with imputed sequence variants revealed seven quantitative trait loci (QTL) for milk fat percentage. Two causal mutations in the DGAT1 and GHR genes were the most significantly associated variants at two QTL on chromosomes 14 and 20 when Minimac was used to infer genotypes.
Conclusions
The population-based imputation of millions of sequence variants in large cohorts is computationally feasible and provides accurate genotypes. However, the accuracy of imputation is low in regions where the genome contains large segmental duplications or the coverage with array-derived single nucleotide polymorphisms is poor. Using a reference population that includes individuals from many breeds increases the accuracy of imputation particularly at low-frequency variants. Considering allele dosages rather than best-guess genotypes as explanatory variables is advantageous to detect causal mutations in association studies with imputed sequence variants.
en_US
dc.format
application/pdf
dc.language.iso
en
en_US
dc.publisher
EDP Sciences
en_US
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.title
Evaluation of the accuracy of imputed sequence variant genotypes and their utility for causal variant detection in cattle
en_US
dc.type
Journal Article
dc.rights.license
Creative Commons Attribution 4.0 International
dc.date.published
2017-02-21
ethz.journal.title
Genetics Selection Evolution
ethz.journal.volume
49
en_US
ethz.journal.abbreviated
Genet. sel. evol.
ethz.pages.start
24
en_US
ethz.size
14 p.
en_US
ethz.version.deposit
publishedVersion
en_US
ethz.code.ddc
DDC - DDC::6 - Technology, medicine and applied sciences::630 - Agriculture
en_US
ethz.code.ddc
DDC - DDC::5 - Science::590 - Zoological sciences
en_US
ethz.publication.place
Les Ulis
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02350 - Dep. Umweltsystemwissenschaften / Dep. of Environmental Systems Science::02703 - Institut für Agrarwissenschaften / Institute of Agricultural Sciences
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02350 - Dep. Umweltsystemwissenschaften / Dep. of Environmental Systems Science::02703 - Institut für Agrarwissenschaften / Institute of Agricultural Sciences::09575 - Pausch, Hubert / Pausch, Hubert
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02350 - Dep. Umweltsystemwissenschaften / Dep. of Environmental Systems Science::02703 - Institut für Agrarwissenschaften / Institute of Agricultural Sciences::09575 - Pausch, Hubert / Pausch, Hubert
ethz.date.deposited
2017-06-19T12:12:30Z
ethz.source
BATCH
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2017-06-22T12:09:00Z
ethz.rosetta.lastUpdated
2024-02-02T02:05:16Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Evaluation%20of%20the%20accuracy%20of%20imputed%20sequence%20variant%20genotypes%20and%20their%20utility%20for%20causal%20variant%20detection%20in%20cattle&rft.jtitle=Genetics%20Selection%20Evolution&rft.date=2017-02-21&rft.volume=49&rft.spage=24&rft.issn=0999-193X&1297-9686&rft.au=Pausch,%20Hubert&Macleod,%20Iona&Fries,%20Ruedi&Emmerling,%20Reiner&Bowman,%20Phil%20J.&rft.genre=article&rft_id=info:doi/10.1186/s12711-017-0301-x&