gffread is not working
9.9 years ago
gtho123 ▴ 260

I am trying to extract the CDS sequences from a genome in FASTA format. I have a GTF file.

Googling around, it seems that the gffread command in the Cufflinks package can do this.

However, when I try to use it I just get blank files.

Following the documentation I tried this:

gffread transcripts.gtf -x extracted_transcripts.fa -g genome.fasta

but I just got an empty file.

Then, because most Cufflinks programs seem to expect options first and inputs second, I tried this:

gffread -x extracted_transcripts.fa -g genome.fasta transcripts.gtf

Same result. What am I missing?

software-error next-gen • 9.8k views

This works for me: gffread -g genome.fa -x CDS.fa annotation.gtf


Make sure you have valid GTF and FASTA files and that both use the same naming convention for chromosomes. It would help if you pasted the first couple of lines of each file.
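
One way to check the chromosome naming is a short script that compares the sequence names in column 1 of the GTF with the FASTA header names. This is only a minimal sketch: the function names and the toy data are made up for illustration, and it assumes a tab-separated GTF and FASTA headers whose name is the text before the first whitespace.

```python
def fasta_names(lines):
    """Return sequence names from FASTA headers (text up to the first whitespace)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_seqnames(lines):
    """Return the column 1 (seqname) values of a GTF, skipping comments and blank lines."""
    names = set()
    for line in lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0].strip())
    return names


def missing_from_fasta(gtf_lines, fasta_lines):
    """Sequence names used in the GTF that have no matching FASTA record."""
    return sorted(gtf_seqnames(gtf_lines) - fasta_names(fasta_lines))


# Toy data (hypothetical): the GTF says "chr1" but the FASTA says "1".
gtf = ['chr1\tCufflinks\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n']
fasta = [">1 dna:chromosome\n", "ACGT\n"]
print(missing_from_fasta(gtf, fasta))  # ['chr1']
```

With real files, open file handles can be passed in place of the lists, for example `missing_from_fasta(open("transcripts.gtf"), open("genome.fasta"))`. Any name it reports is a sequence the GTF refers to that the genome file does not contain under that name.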

9.9 years ago
gtho123 ▴ 260

I was mistaken. I assumed that because my *.gtf file was produced by Cufflinks I could use gffread -x, as gffread is part of the same suite of programs as Cufflinks. This is not the case: Cufflinks does not identify CDS features, and its output does not go beyond the transcript level.

Following the thread "How To Extract Cds And Protein Sequences From Cufflinks Transcripts.Gtf File?", I am going to give TransDecoder a go.
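
For anyone hitting the same empty output, a quick check is to count the feature types in column 3 of the GTF: if there are no CDS records, gffread -x has nothing to extract. The snippet below is a rough sketch with a hypothetical function name and made-up toy records, assuming a standard tab-separated GTF.

```python
from collections import Counter


def feature_counts(lines):
    """Count the feature types (column 3) in a GTF, skipping comments and blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Toy records (hypothetical) shaped like Cufflinks output: transcript and exon only.
gtf = [
    'chr1\tCufflinks\ttranscript\t1\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tCufflinks\texon\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tCufflinks\texon\t200\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
counts = feature_counts(gtf)
print(dict(counts))
print("has CDS:", counts["CDS"] > 0)
```

On a real file this could be called as `feature_counts(open("transcripts.gtf"))`. A result with only transcript and exon entries matches the situation described above.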


Even with the Cufflinks output file transcripts.gtf, the -w option should work; it writes the spliced exon sequence of each transcript rather than the CDS.

gffread transcripts.gtf -g genome.fa -o transcripts.gff -w transcripts.fa
