GenoCore : a simple and fast algorithm for core subset selection from large genotype datasets

Cited 49 times in Scopus

Full metadata record

DC Field | Value | Language
dc.contributor.author | Seongmun Jeong | -
dc.contributor.author | Jae-Yoon Kim | -
dc.contributor.author | Soon-Chun Jeong | -
dc.contributor.author | S T Kang | -
dc.contributor.author | J K Moon | -
dc.contributor.author | Namshin Kim | -
dc.date.accessioned | 2017-08-29 | -
dc.date.available | 2017-08-29 | -
dc.date.issued | 2017 | -
dc.identifier.issn | 1932-6203 | -
dc.identifier.uri | 10.1371/journal.pone.0181420 | ko
dc.identifier.uri | https://oak.kribb.re.kr/handle/201005/17266 | -
dc.description.abstract | Selecting core subsets from plant genotype datasets is important for enhancing cost-effectiveness and shortening the time required for genome-wide association studies (GWAS), genomics-assisted breeding of crop species, and similar analyses. Recently, a large number of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density single nucleotide polymorphism (SNP) arrays and next-generation sequencing (NGS) data. However, there is no software available for picking out an efficient and consistent core subset from such a huge dataset. It is necessary to develop software that can coherently extract the genetically important samples in a population. Here we present a new program, GenoCore, which can quickly and efficiently find the core subset representing the entire population. We introduce simple measures of coverage and diversity scores, which reflect genotype errors and genetic variations and can help to select samples rapidly and accurately from crop genotype datasets. Comparisons of our method with other core collection software on example datasets were performed to validate its performance in terms of genetic distance, diversity, coverage, required system resources, and the number of selected samples. GenoCore selects the smallest, most consistent, and most representative core collection from all samples, using less memory with more efficient scores, and shows greater genetic coverage than the other software tested. GenoCore is written in the R language and can be accessed online, with an example dataset and test results, at https://github.com/lovemun/Genocore. | -
dc.publisher | Public Library of Science | -
dc.title | GenoCore : a simple and fast algorithm for core subset selection from large genotype datasets | -
dc.title.alternative | GenoCore : a simple and fast algorithm for core subset selection from large genotype datasets | -
dc.type | Article | -
dc.citation.title | PLoS One | -
dc.citation.number | 7 | -
dc.citation.endPage | e0181420 | -
dc.citation.startPage | e0181420 | -
dc.citation.volume | 12 | -
dc.contributor.affiliatedAuthor | Seongmun Jeong | -
dc.contributor.affiliatedAuthor | Jae-Yoon Kim | -
dc.contributor.affiliatedAuthor | Soon-Chun Jeong | -
dc.contributor.affiliatedAuthor | Namshin Kim | -
dc.contributor.alternativeName | 정성문 | -
dc.contributor.alternativeName | 김재윤 | -
dc.contributor.alternativeName | 정순천 | -
dc.contributor.alternativeName | 강성택 | -
dc.contributor.alternativeName | 문중경 | -
dc.contributor.alternativeName | 김남신 | -
dc.identifier.bibliographicCitation | PLoS One, vol. 12, no. 7, pp. e0181420-e0181420 | -
dc.identifier.doi | 10.1371/journal.pone.0181420 | -
dc.description.journalClass | Y | -
Appears in Collections:
Division of A.I. & Biomedical Research > Genomic Medicine Research Center > 1. Journal Articles
Ochang Branch Institute > 1. Journal Articles