KoNA: Korean nucleotide archive as a new data repository for nucleotide sequence data

Cited 4 times in Scopus
Title
KoNA: Korean nucleotide archive as a new data repository for nucleotide sequence data
Author(s)
Gunhwan Ko; Jae Ho Lee; Young Mi Sim; Wangho Song; Byung-Ha Yoon; Iksu Byeon; Bang Hyuck Lee; Sang-Ok Kim; Jinhyuk Choi; Insoo Jang; Hyerin Kim; Jin Ok Yang; Kiwon Jang; Sora Kim; Jong-Hwan Kim; Jongbum Jeon; Jaeeun Jung; Seungwoo Hwang; Ji Hwan Park; Pan-Gyu Kim; Seon-Young Kim; Byungwook Lee
Bibliographic Citation
Genomics, Proteomics & Bioinformatics, vol. 22, article qzae017
Publication Year
2024
Abstract
During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. KoNA is available at https://www.kobic.re.kr/kona/.
Keyword
Korea BioData Station; Nucleotide sequence; Next-generation sequencing repository; Genomics; Deposition and access of big data
ISSN
1672-0229
Publisher
Oxford Univ Press
Full Text Link
http://dx.doi.org/10.1093/gpbjnl/qzae017
Type
Article
Appears in Collections:
Division of A.I. & Biomedical Research > Genomic Medicine Research Center > 1. Journal Articles