How to map the headers of a protein multiple sequence alignment to their CDS counterparts across multiple files

2.7 years ago

Rijan • 0

I have several files, each containing a protein multiple sequence alignment in the following format:

Species one, gene x

(Protein sequence of gene x)

Species two, gene y

(Protein sequence of gene y)

Species three, gene z

(Protein sequence of gene z)

I also have whole-CDS FASTA files for all the species involved: species_one_cds.fa, species_two_cds.fa, and species_three_cds.fa.

I need something that reads the headers in the protein multiple sequence alignment, finds those same headers in the CDS FASTA files, and writes out the CDS equivalent of the protein file. The final product would look something like this:

Species one, gene x

(CDS sequence of gene x)

Species two, gene y

(CDS sequence of gene y)

Species three, gene z

(CDS sequence of gene z)

Is there a software package that can do something like this?
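
For illustration, the lookup described above can be sketched in plain Python without any extra package. This is only a minimal sketch under assumptions that are not stated in the question: both the alignment and the CDS files are FASTA with `>` header lines, and the first whitespace-delimited token of each header is an ID shared between the protein and CDS files. All function names here (`parse_fasta`, `seq_id`, `build_cds_index`, `cds_for_alignment`, `write_fasta`, `main`) are hypothetical helpers made up for this example.

```python
def parse_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def seq_id(header):
    """Assumption: the first whitespace-delimited token is the shared ID."""
    parts = header.split()
    return parts[0] if parts else ""


def build_cds_index(cds_paths):
    """Load every CDS file into one dict mapping ID -> CDS sequence."""
    index = {}
    for path in cds_paths:
        with open(path) as handle:
            for header, seq in parse_fasta(handle):
                index[seq_id(header)] = seq
    return index


def cds_for_alignment(protein_records, cds_index):
    """Return (found, missing): CDS records in alignment order, plus
    the alignment headers that had no match in the CDS index."""
    found, missing = [], []
    for header, _protein in protein_records:
        cds = cds_index.get(seq_id(header))
        if cds is None:
            missing.append(header)
        else:
            found.append((header, cds))
    return found, missing


def write_fasta(records, path, width=60):
    """Write (header, sequence) records as FASTA, wrapped at `width`."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")


def main(alignment_path, cds_paths, out_path):
    """Glue code: one protein alignment in, one CDS FASTA out."""
    cds_index = build_cds_index(cds_paths)
    with open(alignment_path) as handle:
        proteins = list(parse_fasta(handle))
    found, missing = cds_for_alignment(proteins, cds_index)
    write_fasta(found, out_path)
    return missing  # headers that need a manual check
```

Calling something like `main("alignment.fa", ["species_one_cds.fa", "species_two_cds.fa", "species_three_cds.fa"], "alignment_cds.fa")` would write the CDS records in the same order as the alignment (the alignment file name here is made up). Note that the output is unaligned CDS: the gaps from the protein alignment are not carried over. If a codon-level alignment is the end goal, a tool such as PAL2NAL is designed to thread CDS onto a protein alignment.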

mutiple_alignment peptide_files cds_files • 442 views
