Question: How to extract fasta sequences from assembled transcripts generated by Stringtie
2.2 years ago by
seta1.2k wrote:

Hi all,

I used STAR to map reads to the reference genome and StringTie for assembly. As you know, StringTie writes the assembled transcripts in GTF format. Now I want the FASTA sequences of the assembled transcripts. I tried gffread, but all the sequences had the same header; maybe it is not compatible with StringTie. Could you please help me convert the assembled transcripts from StringTie's GTF format to FASTA format?


22 months ago by
zzqr20 wrote:

The stringtie_merged.gtf file has seqname, start, end, and strand information, so you can build an R GRanges object (GenomicRanges package) and use the getSeq function (BSgenome package) to retrieve the sequences.
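
The same idea (group exons by transcript, pull each interval from the genome, join them, and reverse-complement minus-strand transcripts) can be sketched without R. This is a minimal standard-library Python illustration, not the gffread or BSgenome implementation; the function names and the tiny in-memory genome are made up for the example, and it assumes the GTF seqnames match the FASTA headers and that each transcript's exons share one strand.

```python
from collections import defaultdict

# Translation table for complementing DNA (unknown bases stay N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}; name = first word of the header."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def transcript_sequences(gtf_lines, genome):
    """Return {transcript_id: spliced sequence} built from GTF 'exon' rows."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        # GTF attributes look like: gene_id "G1"; transcript_id "T1";
        tid = None
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("transcript_id "):
                tid = attr.split(" ", 1)[1].strip('"')
        if tid is None:
            continue
        exons[tid].append((fields[0], int(fields[3]), int(fields[4]), fields[6]))

    result = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda p: p[1])          # genomic order by start
        chrom, strand = parts[0][0], parts[0][3]
        # GTF coordinates are 1-based and inclusive.
        seq = "".join(genome[chrom][start - 1:end] for _, start, end, _ in parts)
        if strand == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
        result[tid] = seq
    return result


# Toy data standing in for a real genome FASTA and StringTie GTF.
demo_genome = read_fasta([">chr1 toy", "AAACCC", "GGGTTT"])
demo_gtf = [
    'chr1\tStringTie\texon\t1\t3\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tStringTie\texon\t7\t9\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]
for tid, seq in transcript_sequences(demo_gtf, demo_genome).items():
    print(f">{tid}\n{seq}")
```

With real data you would pass open file handles instead of the toy lists (for example a genome FASTA and stringtie_merged.gtf; both file names are placeholders). Because each record is keyed by its transcript_id, every sequence gets its own header.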

22 months ago by
lakhujanivijay4.2k wrote:

You can also use bedtools getfasta to fetch sequences from GTF or BED files.


Here is the perfect solution


I used this, but I ran into the following error:

"Error (GFaSeqGet): subsequence cannot be larger than 465 Error getting subseq for gene1 (465..1503)!"

Did you have any issues using gffread?
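
The message reads as if a range ending at 1503 was requested from a sequence only 465 bases long. One possible cause (an assumption, not confirmed anywhere in this thread) is that the FASTA file given to gffread is not the same reference the GTF coordinates refer to. A quick way to test that is to compare every GTF feature against the FASTA sequence lengths. The sketch below does this in plain Python; the function names are hypothetical.

```python
def fasta_lengths(lines):
    """Return {sequence_name: length} from FASTA lines."""
    lengths, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]   # name = first word of the header
            lengths[name] = 0
        elif line and name is not None:
            lengths[name] += len(line)
    return lengths


def out_of_range_features(gtf_lines, seq_lengths):
    """List GTF features whose seqname is missing or whose end exceeds the sequence."""
    problems = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        name, start, end = fields[0], int(fields[3]), int(fields[4])
        if name not in seq_lengths:
            problems.append((name, start, end, "seqname not in FASTA"))
        elif end > seq_lengths[name]:
            problems.append((name, start, end, "end beyond sequence length"))
    return problems


# Toy example: a 6-base sequence and one feature that runs past its end.
demo_lengths = fasta_lengths([">gene1", "ACGT", "AC"])
demo_gtf = [
    'gene1\tStringTie\texon\t1\t5\t.\t+\t.\ttranscript_id "T1";',
    'gene1\tStringTie\texon\t4\t10\t.\t+\t.\ttranscript_id "T1";',
]
for problem in out_of_range_features(demo_gtf, demo_lengths):
    print(problem)
```

If this reports features on your real files, the GTF and the FASTA do not describe the same reference, and the fix is to pass the genome that was used for mapping.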


written 9 months ago by spriyar10