Question: GTF to FASTA including stop codon
4.0 years ago by
United States
athey.johnc40 wrote:

I am looking for a way to extract coding sequences, including their stop codons where available, from a genome and GTF annotations. None of the tools I've come across, including TopHat's gtf_to_fasta and gffread, writes out the full sequence with the stop codon. gffread has a -J flag, "discard any mRNAs that either lack initial START codon or the terminal STOP codon, or have an in-frame stop codon (only print mRNAs with a full, valid CDS)", which does make it write out the sequence with the stop codon, but it has the disadvantage of excluding any partial sequences that lack an annotated stop codon.

Feeding these tools a GTF file with only CDS entries produces just the amino-acid-encoding part of the sequence (no stop codon). Feeding them a GTF with both CDS and stop_codon rows causes them (except gffread -J) to write out the coding sequence plus only the first nucleotide of the stop codon, which I don't understand either. Are there other tools available that could do what I need?
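To make the desired output concrete, here is a minimal sketch in plain Python of the behavior I am after. It is not how gffread or gtf_to_fasta work internally. It assumes the GTF convention that stop_codon rows lie outside the CDS rows, that every row carries a transcript_id attribute, and that the genome fits in memory. All function names (`read_fasta`, `collect_cds_intervals`, `merge_intervals`, `cds_with_stop`) are made up for illustration.

```python
import re
from collections import defaultdict

# Translation table used to complement a nucleotide sequence.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}."""
    genome, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def collect_cds_intervals(gtf_lines):
    """Group CDS and stop_codon intervals (1-based, inclusive) by transcript_id."""
    transcripts = defaultdict(
        lambda: {"chrom": None, "strand": None, "intervals": []}
    )
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] not in ("CDS", "stop_codon"):
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        record = transcripts[match.group(1)]
        record["chrom"], record["strand"] = fields[0], fields[6]
        record["intervals"].append((int(fields[3]), int(fields[4])))
    return transcripts


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent intervals.

    Assumption: a stop_codon touching the last CDS segment belongs to the
    same exon, so the two are joined into one stretch.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def cds_with_stop(genome, record):
    """Return the CDS plus stop codon (if annotated) in transcript orientation."""
    chrom_seq = genome[record["chrom"]]
    seq = "".join(
        chrom_seq[start - 1:end]
        for start, end in merge_intervals(record["intervals"])
    )
    if record["strand"] == "-":
        # Reverse-complement transcripts on the minus strand.
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq
```

Transcripts without a stop_codon row are still written out, just without a stop, which is the case gffread -J drops. Merging overlapping intervals also means a file whose CDS rows already contain the stop codon would not get it counted twice.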

gffread tophat fasta gtf • 1.6k views

