Question: How to extract fasta sequences from assembled transcripts generated by Stringtie
seta wrote:

Hi all,

I used STAR to map reads to the reference genome and StringTie to assemble the transcripts. StringTie writes the assembled transcripts in GTF format, and I now want the FASTA sequences of those transcripts. I tried gffread, but all of the output sequences had the same header; maybe it is not compatible with StringTie output. Could you please help me convert the transcripts assembled by StringTie from GTF format to FASTA format?

Thanks

modified 18 months ago by bioExplorer • written 22 months ago by seta
zzqr wrote:

The stringtie_merged.gtf file has seqname, start, end, and strand information, so in R you can build a GRanges object and use the getSeq function (GenomicRanges and BSgenome packages) to retrieve the sequences.
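
For anyone who prefers not to use R, the same idea (look up each exon by seqname, start, end, and strand, then join the exons of a transcript) can be sketched in plain Python. This is only a minimal sketch, not what gffread or getSeq do internally. The function names are made up for illustration, and it assumes that every exon line carries a `transcript_id` attribute, that sequence names in the GTF match the FASTA headers, and that the genome fits in memory.

```python
import re
from collections import defaultdict

def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}; name is the first word of the header."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {k: "".join(v) for k, v in chunks.items()}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]

def transcripts_from_gtf(gtf_lines, genome):
    """Return {transcript_id: spliced sequence} built from the exon records."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        exons[match.group(1)].append(
            (fields[0], int(fields[3]), int(fields[4]), fields[6])
        )
    result = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda e: e[1])  # order exons by genomic start
        # GTF coordinates are 1-based and inclusive.
        seq = "".join(genome[chrom][start - 1:end] for chrom, start, end, _ in parts)
        if parts[0][3] == "-":  # "+" and "." are both treated as forward
            seq = reverse_complement(seq)
        result[tid] = seq
    return result

# Toy data, invented for this example.
genome = read_fasta([">chr1 toy contig", "AAACCCGGGTTTACGT"])
gtf = [
    'chr1\tStringTie\ttranscript\t1\t9\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\tStringTie\texon\t1\t3\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\tStringTie\texon\t7\t9\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\tStringTie\texon\t4\t6\t.\t-\t.\tgene_id "G2"; transcript_id "G2.1";',
    'chr1\tStringTie\texon\t13\t16\t.\t-\t.\tgene_id "G2"; transcript_id "G2.1";',
]
transcripts = transcripts_from_gtf(gtf, genome)
for tid, seq in transcripts.items():
    print(f">{tid}\n{seq}")
```

Each record gets its own header because the transcript ID is used as the FASTA name. For real files you would pass open file handles instead of the toy lists.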

written 18 months ago by zzqr
bioExplorer wrote:

You can also use bedtools getfasta to fetch sequences from GTF or BED files.


UPDATE

Here is the perfect solution

modified 16 months ago • written 18 months ago by bioExplorer

I used this, but I ran into the following error:

"Error (GFaSeqGet): subsequence cannot be larger than 465 Error getting subseq for gene1 (465..1503)!"

Did you have any issues using gffread?

Thanks
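
I can't say for certain what caused this error. One possible cause (an assumption, not something confirmed in this thread) is that the coordinates in the GTF run past the end of the matching sequence in the FASTA file, for example when the FASTA is not the same reference the reads were mapped to. A quick way to check is to compare every feature's end coordinate against the sequence lengths. The sketch below does that in plain Python; the function names are made up for illustration.

```python
def read_fasta_lengths(lines):
    """Return {sequence name: length}; name is the first word of the header."""
    lengths, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

def out_of_range_features(gtf_lines, lengths):
    """List GTF features whose sequence is missing or whose end exceeds its length."""
    problems = []
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        seqname, start, end = fields[0], int(fields[3]), int(fields[4])
        if seqname not in lengths:
            problems.append((seqname, start, end, "sequence name not in FASTA"))
        elif end > lengths[seqname]:
            problems.append(
                (seqname, start, end, f"end exceeds sequence length {lengths[seqname]}")
            )
    return problems

# Toy data, invented for this example.
toy_lengths = read_fasta_lengths([">chr1", "ACGTA", "CGTAC"])
toy_gtf = [
    'chr1\tStringTie\texon\t1\t8\t.\t+\t.\ttranscript_id "T1";',
    'chr1\tStringTie\texon\t5\t15\t.\t+\t.\ttranscript_id "T2";',
    'chrX\tStringTie\texon\t1\t4\t.\t+\t.\ttranscript_id "T3";',
]
for problem in out_of_range_features(toy_gtf, toy_lengths):
    print(problem)
```

If this reports any features on the real files, the GTF and the FASTA probably do not describe the same set of sequences.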

written 5 months ago by spriyar