- Title
- Gene sequences clustering for the prediction of functional domain = 기능 도메인 예측을 위한 유전자 서열 클러스터링
- Author(s)
- S I Han; S G Lee; Bo Kyeng Hou; Y S Byun; K S Hwang
- Bibliographic Citation
- Journal of Control, Automation, and Systems Engineering, vol. 12, no. 10, pp. 1044-1049
- Publication Year
- 2006
- Abstract
- Multiple sequence alignment is a method for comparing two or more DNA or protein sequences. Most multiple sequence alignment methods rely on pairwise alignment and the Smith-Waterman algorithm [Needleman and Wunsch, 1970; Smith and Waterman, 1981] to generate an alignment hierarchy. Therefore, as the number of sequences increases, the runtime increases exponentially. To resolve this problem, this paper presents a multiple sequence alignment method that uses a parallel-processing suffix tree algorithm to search for common subsequences all at once, without pairwise alignment. Cross-matched subsequences may arise among the common subsequences found, and these cause inexact matching, so this study proposes a procedure for masking cross-matching pairs. The proposed method, an improved STC (Suffix Tree Clustering), is summarized as follows: (1) construction of the suffix tree; (2) search and overlap of common subsequences; (3) grouping of subsequence pairs; (4) masking of cross-matching pairs; and (5) clustering of gene sequences. The new method was evaluated successfully on 23 genes from Mus musculus and 22 genes from three species, yielding nine and eight clusters, respectively.
- Keyword
- clustering; gene; multiple sequence alignment; sequence; suffix tree; BLAST; domain
- ISSN
- 1225-9845
- Publisher
- Korea Soc-Assoc-Inst
- Type
- Article
Items in OpenAccess@KRIBB are protected by copyright, with all rights reserved, unless otherwise indicated.
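
The idea in the abstract of clustering gene sequences by the subsequences they share, without pairwise alignment, can be illustrated with a minimal sketch. This is not the paper's improved STC: a plain index of fixed-length substrings stands in for the suffix tree, the masking of cross-matching pairs (step 4) is omitted because the abstract does not define it, and the function names, the parameter `k`, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def shared_kmers(seqs, k):
    """Map each length-k substring to the set of sequence names containing it.

    Stand-in for steps (1)-(2): a real suffix tree would find common
    substrings of any length; here the length is fixed at k.
    """
    index = defaultdict(set)
    for name, seq in seqs.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    # Keep only substrings common to at least two sequences.
    return {kmer: names for kmer, names in index.items() if len(names) > 1}


def cluster(seqs, k):
    """Group sequences that share at least one length-k substring.

    Rough analogue of steps (3) and (5): sequence pairs linked by a common
    substring are merged with a union-find structure.
    """
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for names in shared_kmers(seqs, k).values():
        for a, b in combinations(sorted(names), 2):
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for name in seqs:
        groups[find(name)].append(name)
    return sorted(sorted(g) for g in groups.values())


# Toy sequences, invented for illustration only (not data from the paper).
toy = {
    "g1": "ACGTACGTGG",
    "g2": "TTACGTACAA",
    "g3": "CCCCGGGGAA",
    "g4": "TGGGGAATTT",
}
clusters = cluster(toy, k=6)
print(clusters)
```

With `k=6`, `g1` and `g2` are linked by the shared substring `ACGTAC` and `g3` and `g4` by `GGGGAA`, so two clusters result. Raising `k` makes the grouping stricter, since longer exact matches are required before two sequences are merged.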