Use of SNP genotypes to identify carriers of harmful recessive mutations in cattle populations
Nicolazzi, Ezequiel L.
- Journal Article
Rights / license: Creative Commons Attribution 4.0 International
Background: Single nucleotide polymorphism (SNP) genotype data are increasingly available in cattle populations and, among other things, can be used to predict carriers of specific mutations. It is therefore convenient to have a practical statistical method for the accurate classification of individuals into carriers and non-carriers. In this paper, we compared, through cross-validation, five classification models for the identification of carriers of the TUBD1 recessive mutation on BTA19 (Bos taurus autosome 19), known to be associated with high calf mortality: Lasso-penalized logistic regression (Lasso), support vector machines with either a linear or a radial kernel (SVML and SVMR), k-nearest neighbors (KNN), and multi-allelic gene prediction (MAG). A population of 3116 Fleckvieh and 392 Brown Swiss animals genotyped with the 54K SNP-chip was available for the analysis.
Results: In general, the use of SNP genotypes proved to be very effective for the identification of mutation carriers. The best predictive models were Lasso, SVML and MAG, with an average error rate, respectively, of 0.2 %, 0.4 % and 0.6 % in Fleckvieh, and 1.2 %, 0.9 % and 1.7 % in Brown Swiss. For the three models, the false positive rate was, respectively, 0.1 %, 0.1 % and 0.2 % in Fleckvieh, and 3.0 %, 2.4 % and 1.6 % in Brown Swiss; the false negative rate was 4.4 %, 7.6 % and 1.0 % in Fleckvieh, and 0.0 %, 0.1 % and 0.8 % in Brown Swiss. MAG appeared to be more robust to sample size reduction: with 25 % of the data, the average error rate was 0.7 % and 2.2 % in Fleckvieh and Brown Swiss, respectively, compared to 2.1 % and 5.5 % with Lasso, and 2.6 % and 12.0 % with SVML.
Conclusions: The use of SNP genotypes is a very effective and efficient technique for the identification of mutation carriers in cattle populations. Very few misclassifications were observed, both overall and within the carrier and non-carrier classes. This indicates that the approach is very reliable for potential applications in cattle breeding.
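The abstract reports three quantities per model: the average error rate, the false positive rate and the false negative rate. As a minimal sketch of how such figures relate to one another, the Python function below computes them from true and predicted carrier labels. It assumes the standard definitions (false positives divided by true non-carriers, false negatives divided by true carriers); the record does not state the exact definitions used in the paper, and the function name `carrier_error_rates` and the toy labels are hypothetical, not data from the study.

```python
from typing import Dict, Sequence


def carrier_error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Summarise misclassification for a carrier / non-carrier classifier.

    Labels are assumed to be 1 for carrier and 0 for non-carrier.
    Rates use the standard definitions, which is an assumption here:
      error_rate          = (FP + FN) / N
      false_positive_rate = FP / (FP + TN)
      false_negative_rate = FN / (FN + TP)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 (non-carrier) or 1 (carrier)")

    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "error_rate": (fp + fn) / total if total else nan,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else nan,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else nan,
    }


# Toy illustration (made-up labels): 50 carriers among 1000 animals,
# with 2 carriers missed and 1 non-carrier wrongly flagged.
toy_true = [1] * 50 + [0] * 950
toy_pred = [1] * 48 + [0] * 2 + [1] * 1 + [0] * 949
toy_rates = carrier_error_rates(toy_true, toy_pred)
```

In this toy case the overall error rate is 0.3 % while the false negative rate is 4 %, because carriers are a small fraction of the sample. This is one way an overall error rate can be much lower than the false negative rate, a pattern also visible in the Fleckvieh figures quoted in the abstract.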
Journal / series: BMC Genomics
Pages / Article No.
Subject: SNP genotypes; Recessive mutations; Carrier identification; Lasso-penalised logistic regression; Support vector machines; KNN; MAG; Haplotypes; Cattle
Organisational unit: 02703 - Institut für Agrarwissenschaften / Institute of Agricultural Sciences
09575 - Pausch, Hubert / Pausch, Hubert