Authors: Amit Shukla and Suneeta Agarwal
The Longest common subsequence problem is to find a subsequence that is common to two or more given sequences and is the longest one of such subsequences. This problem of searching the longest common subsequence (LCS) from multiple biological sequences is one of the most fundamental tasks in bioinformatics. Earlier Rizvi and Agarwal had proposed an algorithm for it, but we found out that their algorithm gives wrong results in some cases. We have proposed an algorithm towards correcting the errors. In this paper, we present a parallel algorithm named NEW_LCS_ALGO to find the LCS from any number of given DNA, RNA, Protein or general sequences. Here we give examples of DNA sequences, although this can be used for RNA, Protein or general sequences (provided we know the characters being used). The speed up in our LCS Algorithm is achieved through pruning operations, in which we reject all those nucleotides which cannot generate the next character of the LCS thus reducing the search space and accelerating the search speed.