How to extract gene sequences from a FASTA file using Cufflinks IDs?
6.4 years ago
AP ▴ 80

Hello everyone,

I am trying to assign functions to genes that are differentially expressed across different samples, but I am having trouble extracting the FASTA sequences for some of them.

I had a GTF file from AUGUSTUS, which I merged with the GTF files generated by Cufflinks using cuffmerge. I then realized that some of the expressed genes do not have gene names like g123, g6352, etc., but instead have IDs like CUFF 2.1, CUFF 12.1, etc.

I have already extracted the FASTA sequences for the genes that have proper gene names. However, I don't know how to extract the sequences for the other genes using their Cufflinks IDs. Is there a solution to my problem? I would really appreciate your help.

Thank you,

Ambika

RNA-Seq sequence gene • 2.2k views

Have you tried gffread or fastaFromBed? gffread has options to extract only the coding regions, the protein sequences, and more from a GFF/GTF file.
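Before any of these tools can help, the Cufflinks IDs have to be matched to records in the merged GTF. Here is a minimal Python sketch of that lookup, assuming the merged file keeps the original Cufflinks ID in one of the attributes transcript_id, gene_id, or oId. The oId key is an assumption about what cuffmerge writes, so check your own file. The function name find_exons is just illustrative.

```python
import re
from collections import defaultdict

# Attribute keys to search. "oId" is an assumption about where cuffmerge
# stores the original Cufflinks ID; adjust after inspecting your GTF.
ID_KEYS = ("transcript_id", "gene_id", "oId")

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def find_exons(gtf_lines, wanted_ids, id_keys=ID_KEYS):
    """Return {matched_id: [(chrom, start, end, strand), ...]} for exon records.

    gtf_lines is any iterable of GTF lines (for example an open file handle).
    Coordinates are kept 1-based and inclusive, as they appear in the GTF.
    """
    wanted = set(wanted_ids)
    hits = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        for key in id_keys:
            value = attrs.get(key)
            if value in wanted:
                exon = (fields[0], int(fields[3]), int(fields[4]), fields[6])
                hits[value].append(exon)
                break  # count each exon line once per record
    # Sort each transcript's exons by start coordinate.
    return {k: sorted(v, key=lambda e: e[1]) for k, v in hits.items()}
```

Something like find_exons(open("merged.gtf"), ["CUFF.2.1"]) would then give the exon coordinates for that ID, where merged.gtf is a placeholder for your own merged annotation. If nothing comes back, the ID is probably stored under a different attribute or written slightly differently in the file.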


No, I haven't tried that. The problem is that my annotation file doesn't contain those Cufflinks IDs.


You can also use gtf_to_fasta (provided by the TopHat package, available in the Ubuntu repositories) if you have a GTF file.

6.4 years ago

Do you have the locations of the sequences you want? If so, gffread is the answer.
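To illustrate what "extracting by location" means, here is a rough Python sketch that joins exon sequences from a genome FASTA and reverse-complements the result for minus-strand features. It is not gffread's implementation, only the same general idea as its spliced-exon output (the -w option together with -g for the genome). The function names are illustrative, and the sketch assumes the genome fits in memory and that all exons of one transcript share a strand.

```python
def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}; name is the first word of the header."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N in either case)."""
    return seq.translate(_COMPLEMENT)[::-1]


def spliced_sequence(genome, exons):
    """Join exon sequences for one transcript.

    exons is a list of (chrom, start, end, strand) tuples with 1-based,
    inclusive coordinates, as in a GTF file.
    """
    if not exons:
        raise ValueError("no exons given")
    exons = sorted(exons, key=lambda e: e[1])
    # GTF start is 1-based inclusive, so shift it by one for Python slicing.
    seq = "".join(genome[chrom][start - 1:end] for chrom, start, end, _ in exons)
    return reverse_complement(seq) if exons[0][3] == "-" else seq
```

For a real genome, the dedicated tools are faster and handle indexing for you, so treat this mainly as a way to check that the coordinates you found give the sequence you expect.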
