Hi,
I have two FASTA files, each containing ~300,000 coding exon sequences for one species. Both species are mapped to the same reference and hence share the same coordinates (synteny cannot be detected). Example:

File 1:
>Rnor_chr1:268298874-268299113
TTACGCACACGGGGCACAGCCGCACTTGGTGGGCTTCT.....

File 2:
>Rrat_chr1:268298874-268299113
TTACGCACACGGGGCACAGCCGCACTTGGTGGGCTTCTCCACCTC....
I want to align each pair of corresponding entries and then calculate Ka/Ks using PAML; in other words, pairwise alignment of thousands of pairs. From what I can tell, MUSCLE and PRANK each require an input file that contains both sequences to be aligned, one from each species. Would that mean I need 300,000 input FASTA files? Can anybody suggest how this can be solved?
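To make the pairing step concrete, here is a minimal sketch of how the per-pair input files could be generated by a script rather than by hand. It assumes every header has the form `<species>_<chrom>:<start>-<end>`, as in the example above, and that the part after the first underscore is identical for matching entries. The function names (`read_fasta`, `coord_key`, `paired_records`, `write_pairs`) and the output file naming are hypothetical, and the aligner and PAML calls are deliberately left out.

```python
import os


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def coord_key(header):
    """Strip the species prefix: 'Rnor_chr1:1-9' becomes 'chr1:1-9'.

    Assumption: the species tag is everything before the first underscore.
    """
    name = header.split()[0]
    _, sep, rest = name.partition("_")
    return rest if sep else name


def paired_records(path_a, path_b):
    """Yield ((header_a, seq_a), (header_b, seq_b)) for entries sharing coordinates."""
    # Index the second file in memory, keyed by coordinates.
    index_b = {coord_key(h): (h, s) for h, s in read_fasta(path_b)}
    for header, seq in read_fasta(path_a):
        match = index_b.get(coord_key(header))
        if match is not None:
            yield (header, seq), match


def write_pairs(path_a, path_b, out_dir):
    """Write one two-sequence FASTA file per matched pair; return the pair count."""
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for rec_a, rec_b in paired_records(path_a, path_b):
        # Hypothetical naming scheme: coordinates with ':' replaced by '_'.
        name = coord_key(rec_a[0]).replace(":", "_") + ".fa"
        with open(os.path.join(out_dir, name), "w") as out:
            for header, seq in (rec_a, rec_b):
                out.write(f">{header}\n{seq}\n")
        count += 1
    return count
```

Calling `write_pairs("file1.fa", "file2.fa", "pairs")` (placeholder file names) would leave one small FASTA file per exon pair in `pairs/`, which a shell loop or a further script could then pass to the aligner one at a time. Entries present in only one of the two files are skipped silently in this sketch.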
Thank you.