Identifying barley pan-genome sequence anchors using genetic mapping and machine learning
Journal article
Gao, Shang, Wu, Ryan, Stiller, Jiri, Zheng, Zhi, Zhou, Meixue, Wang, You-Gan and Liu, Chunji. (2020). Identifying barley pan-genome sequence anchors using genetic mapping and machine learning. Theoretical and Applied Genetics. 133(9), pp. 2535-2544. https://doi.org/10.1007/s00122-020-03615-y
Authors | Gao, Shang, Wu, Ryan, Stiller, Jiri, Zheng, Zhi, Zhou, Meixue, Wang, You-Gan and Liu, Chunji |
---|---|
Abstract | We identified 1.844 million barley pan-genome sequence anchors from 12,306 genotypes using genetic mapping and machine learning. There is increasing evidence that the genes of any single crop genotype are far from covering all genes in that species; thus, building more comprehensive pan-genomes is of great importance in genetic research and breeding. Obtaining a thousand-genotype-scale pan-genome using deep-sequencing data is currently impractical for species such as barley, which has a huge and highly repetitive genome. To this end, we attempted to identify barley pan-genome sequence anchors from a large quantity of genotype-by-sequencing (GBS) datasets by combining genetic mapping and machine learning algorithms. Based on the GBS sequences from 11,166 domesticated and 1,140 wild barley genotypes, we identified 1.844 million pan-genome sequence anchors. Of these, 532,253 were identified as presence/absence variation (PAV) tags. Aligning these PAV tags to the genome of the hulless barley genotype Zangqing320 validated 83.6% of those from the domesticated genotypes and 88.6% of those from the wild barley genotypes. Association analyses against flowering time, plant height and kernel size showed that the relative importance of the PAV and non-PAV tags varied among traits. The pan-genome sequence anchors based on GBS tags can facilitate the construction of a comprehensive pan-genome and greatly assist various genetic studies in barley, including identification of structural variation, genetic mapping and breeding. |
Keywords | Algorithms; Chromosome Mapping; Plant Genome; Genotype; Hordeum genetics; Linkage Disequilibrium; Machine Learning |
Year | 01 Jan 2020 |
Journal | Theoretical and Applied Genetics |
Journal citation | 133 (9), pp. 2535-2544 |
Publisher | Springer |
ISSN | 0040-5752 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/s00122-020-03615-y |
PubMed ID | 32448920 |
Web address (URL) | https://link.springer.com/article/10.1007/s00122-020-03615-y |
Open access | Published as non-open access |
Research or scholarly | Research |
Page range | 2535-2544 |
Publisher's version | License: All rights reserved; File access level: Controlled |
Output status | Published |
Publication dates | |
24 May 2020 | |
Publication process dates | |
Accepted | 16 May 2020 |
Deposited | 11 Jan 2023 |
Additional information | © Springer-Verlag GmbH Germany, part of Springer Nature 2020. |
Place of publication | Germany |
https://acuresearchbank.acu.edu.au/item/8y92q/identifying-barley-pan-genome-sequence-anchors-using-genetic-mapping-and-machine-learning