Closha 2.0: a bio-workflow design system for massive genome data analysis on high performance cluster infrastructure

Title
Closha 2.0: a bio-workflow design system for massive genome data analysis on high performance cluster infrastructure
Author(s)
Gunhwan Ko; Pan-Gyu Kim; Byung-Ha Yoon; JaeHee Kim; Wangho Song; IkSu Byeon; JongCheol Yoon; Byungwook Lee; Y K Kim
Bibliographic Citation
BMC Bioinformatics, vol. 25, p. 353
Publication Year
2024
Abstract
Background: The explosive growth of next-generation sequencing (NGS) data has produced ultra-large-scale datasets and significant computational challenges. As the cost of NGS has fallen, the volume of genomic data has surged worldwide. However, the cost and complexity of the required computational resources remain substantial barriers to leveraging big data. Cloud computing is a promising solution to these challenges because it provides researchers with the necessary CPUs, memory, storage, and software tools.
Results: Here, we present Closha 2.0, a cloud computing service that offers a user-friendly platform for analyzing massive genomic datasets. Closha 2.0 is designed to provide a cloud-based environment in which all genomic researchers, including those with limited or no programming experience, can easily analyze their genomic data. Version 2.0 has more user-friendly features than version 1.0. First, the workbench includes a script editor that supports Python, R, and shell scripts, so users can write scripts and integrate them into their pipelines. This functionality is particularly useful for downstream analysis. Second, Closha 2.0 runs on containers, which execute each tool in an independent environment. This provides a stable environment and prevents dependency issues and version conflicts among tools. Additionally, users can execute each step of a pipeline individually, allowing them to test applications at each stage and adjust parameters to achieve the desired results. We also updated GBox, a high-speed data transmission tool that facilitates the rapid transfer of large datasets.
Conclusions: The analysis pipelines on Closha 2.0 are reproducible, with all analysis parameters and inputs permanently recorded. Closha 2.0 simplifies multi-step analysis with drag-and-drop functionality and provides a user-friendly interface for genomic scientists to obtain accurate results from NGS data. Closha 2.0 is freely available at https://www.kobic.re.kr/closha2.
Keyword
Closha 2.0; Next-generation sequencing (NGS); Cloud computing; Bioinformatics workflow; High-performance computing (HPC); Genomic data analysis; User-friendly interface; Data transmission (GBox); Single-cell RNA sequencing (scRNASeq)
ISSN
1471-2105
Publisher
Springer-BMC
Full Text Link
http://dx.doi.org/10.1186/s12859-024-05963-8
Type
Article

Items in OpenAccess@KRIBB are protected by copyright, with all rights reserved, unless otherwise indicated.