Aligning genome-wide coding sequences of two species for Ka/Ks calculation
8.1 years ago

Hi,

I have two FASTA files, each containing ~300,000 coding exon sequences for one species. Both species were mapped to the same reference and hence share the same coordinates (synteny cannot be detected). Example:

File 1:
>Rnor_chr1:268298874-268299113
TTACGCACACGGGGCACAGCCGCACTTGGTGGGCTTCT.....

File 2:
>Rrat_chr1:268298874-268299113
TTACGCACACGGGGCACAGCCGCACTTGGTGGGCTTCTCCACCTC....

I want to align each pair of corresponding entries and then calculate Ka/Ks using PAML. Basically, this is pairwise alignment of thousands of pairs. From looking at MUSCLE and PRANK, I get the impression that each run requires an input file containing both sequences to be aligned, one from each species. Would that mean I need 300,000 input FASTA files? Can anybody suggest how this can be solved?
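One possible way to avoid keeping 300,000 input files on disk is to pair the records by their shared coordinates and hand each pair to the aligner through a short-lived temporary file. The Python sketch below is only an illustration under some assumptions: headers look like `Species_chr:start-end` as in the example above, the second file fits in memory, and the aligner is invoked with whatever command line your installed version expects. The function names (`read_fasta`, `coord_key`, `paired_records`, `align_pair`) and the `{input}` placeholder are hypothetical, not part of MUSCLE, PRANK, or PAML.

```python
import os
import subprocess
import tempfile


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def coord_key(header):
    """Drop the species prefix: 'Rnor_chr1:100-200' -> 'chr1:100-200'.

    Assumes every header has the form '<species>_<coordinates>'.
    """
    return header.split()[0].split("_", 1)[1]


def paired_records(fasta_a, fasta_b):
    """Yield (key, record_a, record_b) for coordinates present in both files."""
    index_b = {coord_key(h): (h, s) for h, s in read_fasta(fasta_b)}
    for header_a, seq_a in read_fasta(fasta_a):
        key = coord_key(header_a)
        if key in index_b:
            yield key, (header_a, seq_a), index_b[key]


def align_pair(rec_a, rec_b, aligner_cmd):
    """Write one two-sequence FASTA, run the aligner on it, return its stdout.

    aligner_cmd is a list of arguments supplied by the caller; any argument
    containing the (hypothetical) placeholder '{input}' gets the temporary
    file name substituted in. The temporary file is removed afterwards.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as tmp:
        for header, seq in (rec_a, rec_b):
            tmp.write(f">{header}\n{seq}\n")
    try:
        cmd = [arg.replace("{input}", tmp.name) for arg in aligner_cmd]
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return result.stdout
    finally:
        os.remove(tmp.name)
```

In use, one would loop over `paired_records("file1.fa", "file2.fa")` and call `align_pair` on each pair with an aligner command that writes its alignment to standard output, then pass each alignment on to PAML. The file names here are placeholders, and the exact aligner flags depend on the tool and version, so they are left to the caller.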

Thank you.
