I am trying to extract the CDS sequences from a genome in FASTA format. I have a GTF file.
Googling around, it seems that the gffread command in the Cufflinks package can do this.
However, when I try to use it, I just get blank files.
Following the documentation, I tried this:
gffread transcripts.gtf -x extracted_transcripts.fa -g genome.fasta
but I just got an empty file.
Then, because most Cufflinks programs seem to like having the options first and the inputs second, I tried this:
gffread -x extracted_transcripts.fa -g genome.fasta transcripts.gtf
Same result. What am I missing?
This works for me:
gffread -g genome.fa -x CDS.fa annotation.gtf
Make sure you have valid GTF and FASTA files and that both use the same chromosome names (for example, chr1 vs. 1). It would help if you pasted the first couple of lines of each file.
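
If you want to check this yourself first, here is a rough sketch of two quick tests. The function names missing_names and count_cds are made up for illustration and are not part of gffread. The first prints chromosome names that appear in the GTF but not in the FASTA headers. The second counts CDS records in the GTF: as far as I know, -x writes spliced CDS sequences, so a GTF with no CDS features (Cufflinks-assembled transcripts.gtf files usually contain only transcript and exon records) would give an empty file, and -w (spliced exons) may be what you want instead.

```shell
# Hypothetical helpers, not part of gffread; plain POSIX sh + awk.

# Print sequence names used in the GTF ($1) that have no matching
# header in the FASTA ($2). Only the first word of each FASTA header
# is compared, since that is normally the sequence ID.
missing_names() {
  awk -F'\t' '
    NR == FNR {                      # first file: the FASTA
      if (/^>/) {
        sub(/^>/, "")
        split($0, a, /[ \t]/)
        seen[a[1]] = 1
      }
      next
    }
    # second file: the GTF; skip comments, report each name once
    !/^#/ && NF >= 9 && !($1 in seen) && !reported[$1]++ { print $1 }
  ' "$2" "$1"
}

# Count CDS feature lines (column 3) in the GTF ($1).
count_cds() {
  awk -F'\t' '!/^#/ && $3 == "CDS" { n++ } END { print n + 0 }' "$1"
}
```

Usage would be something like `missing_names transcripts.gtf genome.fasta` and `count_cds transcripts.gtf`. Any output from the first, or a count of 0 from the second, would explain the blank file.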