How to extract CDS sequences from FASTA files using protein IDs?
22 months ago
sunnykevin97 ▴ 980

Hi,

I have finally ended up with a good number of orthologs (1500).

I'd like to extract the CDS for these protein sequences.

Given an IDS.protein.txt file, I'd like to search its IDs against the CDS sequences of all 10 species.

For example -

cat  IDS.protein.txt

>g66422.t1
>XP_034399799.1
>g40125.t1
>g66683.t1
>g53726.t1
>g25019.t1
>g26815.t1
>LipSG014613.t1
>hadal41775
>evm.model.Contig3137_pilon.3_pasa2.longest.filter_rm

I'm interested in searching the 1st ID against the 1.fasta file, the 2nd ID against the 2.fasta file, and so on.

extracting ---

>g66422.t1 (1st ID) vs. **1.fasta**
>XP_034399799.1 (2nd ID) vs. **2.fasta**
....
....
....
....
>evm.model.Contig3137_pilon.3_pasa2.longest.filter_rm (10th ID) vs. **10.fasta**

Any suggestions would be appreciated.
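To make the pairing concrete, here is a minimal Python sketch of the lookup described above: the i-th ID in IDS.protein.txt is searched only in the i-th FASTA file. It rests on assumptions that are not confirmed anywhere in this post: that each CDS header starts with exactly the same ID as the protein (the first whitespace-delimited token after `>`), and that the ID list and the FASTA list are in the same order. The function names and the output file are hypothetical, chosen only for illustration.

```python
def read_ids(path):
    """Return the IDs from the list file, one per line, without the leading '>'."""
    ids = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                ids.append(line.lstrip(">"))
    return ids


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, joining wrapped lines."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_cds(ids_file, fasta_files, out_file):
    """Write the record matching the i-th ID from the i-th FASTA file.

    Assumption: the first token of a CDS header equals the protein ID.
    Returns a list of (id, fasta_file) pairs for which nothing was found.
    """
    ids = read_ids(ids_file)
    if len(ids) != len(fasta_files):
        raise ValueError("need exactly one FASTA file per ID")
    missing = []
    with open(out_file, "w") as out:
        for seq_id, fasta in zip(ids, fasta_files):
            for header, seq in read_fasta(fasta):
                tokens = header.split()
                if tokens and tokens[0] == seq_id:
                    out.write(f">{header}\n{seq}\n")
                    break
            else:
                # loop finished without a match in this species' file
                missing.append((seq_id, fasta))
    return missing
```

For the layout in this question, a call might look like `extract_cds("IDS.protein.txt", [f"{i}.fasta" for i in range(1, 11)], "ortholog.cds.fasta")`. Any IDs returned as missing would suggest that the CDS headers are named differently from the protein IDs, in which case the matching rule would need adjusting.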

gene genome protein • 316 views