Strand in gff file
11 weeks ago
blz • 0

Hello,

It's a basic question, but what does the strand column in a GFF file mean? When a gene is annotated on the + strand, is the sequence I see on the + strand the reverse complement of the mRNA, or is it identical to the mRNA sequence? I'm asking because I need mRNA sequences and I don't know how to get them.

Thanks,

strand mRNA-sequence gff transcript-strand

If a gene is on the positive strand, the mRNA has the same sequence as the reference over its exons (with U in place of T). Rather than coding this from scratch, you can retrieve the sequences from the command line or from a programming language such as R or Python.

11 weeks ago

Coordinates are always given relative to the forward strand; the strand column only indicates the direction of the feature. So for a feature on the reverse strand, you do need to reverse complement the extracted sequence.
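To make the strand rule concrete, here is a minimal Python sketch. It assumes the reference is already loaded as a plain string and that coordinates follow the GFF convention (1-based, inclusive, on the forward strand). The function names `reverse_complement` and `feature_sequence` and the toy reference are made up for illustration; they are not part of any library.

```python
# Translation table for DNA complements (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def feature_sequence(reference, start, end, strand):
    """Extract one feature from a forward-strand reference string.

    start/end are 1-based and inclusive, as in GFF.
    strand is "+" or "-".
    """
    # Convert the 1-based inclusive interval to a Python slice.
    sub = reference[start - 1:end]
    # Features on the reverse strand are read from the opposite strand.
    if strand == "-":
        sub = reverse_complement(sub)
    return sub


# Hypothetical toy reference, not real data.
reference = "AACCGGTTAC"
print(feature_sequence(reference, 2, 5, "+"))  # ACCG
print(feature_sequence(reference, 2, 5, "-"))  # CGGT
```

The same interval gives two different sequences depending on the strand column, which is why the strand has to be taken into account when extracting.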

To obtain sequences use a tool like bedtools getfasta:

bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf>

Options:
...
-s      Force strandedness. If the feature occupies the antisense
        strand, the sequence will be reverse complemented.
...


If the sequences span multiple exons, you could use gffread:

Filter, convert or cluster GFF/GTF/BED records, extract the sequence of transcripts (exon or CDS) and more.
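For illustration only, this is roughly what extracting a multi-exon transcript involves, as a Python sketch: join the exons in forward-strand order, then reverse complement the whole thing if the transcript is on the - strand. It assumes the exon coordinates have already been parsed from the GFF as (start, end) pairs (1-based, inclusive). `spliced_transcript` and `to_rna` are hypothetical helper names, not gffread functions. In practice a tool like gffread handles the parsing and edge cases for you.

```python
# Translation table for DNA complements (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def spliced_transcript(reference, exons, strand):
    """Join exon sequences into one transcript sequence (as DNA).

    exons: list of (start, end) tuples, 1-based inclusive,
           given on the forward strand as in GFF.
    strand: "+" or "-".
    """
    # Concatenate exons in ascending forward-strand order.
    joined = "".join(reference[s - 1:e] for s, e in sorted(exons))
    # For a reverse-strand transcript, reverse complement the joined
    # sequence so it reads 5' to 3' along the transcript.
    if strand == "-":
        joined = joined.translate(_COMPLEMENT)[::-1]
    return joined


def to_rna(seq):
    """Write a DNA string in RNA letters (T becomes U)."""
    return seq.upper().replace("T", "U")


# Hypothetical toy reference and exon coordinates, not real data.
reference = "AACCGGTTACGT"
exons = [(8, 10), (1, 3)]
print(spliced_transcript(reference, exons, "+"))          # AACTAC
print(spliced_transcript(reference, exons, "-"))          # GTAGTT
print(to_rna(spliced_transcript(reference, exons, "+")))  # AACUAC
```

Reverse complementing after joining is equivalent to reverse complementing each exon and concatenating them in descending order; doing it once at the end is just simpler to write.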