What is the best way to handle "?" in the 7th column (strand) of GTF files?
8 weeks ago
ki • 0

Hi all,

I'm working with a GTF file for a new species in which some entries have a "?" in the 7th column, which represents the strand. For example:

   NC_011033.1     RefSeq  transcript      11024   315294  .       ?       .       gene_id "OrsajM_p01"; transcript_id "unassigned_transcript_653"; db_xref "GeneID:6450162"; exception "trans-splicing, RNA editing"; gbkey "mRNA"; gene "nad1"; locus_tag "OrsajM_p01"; transcript_biotype "mRNA"; 

My questions:

What is the best practice for handling "?" in the strand column of GTF files?

Should I remove those entries (even though I am particularly interested in this gene), or replace "?" with a default value (like "+")?

Any advice or experience on this would be greatly appreciated.

Thanks!

K

fasta STAR ENSEMBL GTF RefSeq • 608 views

I think it is allowed to use a dot ( . ) in that column as a sort of "unknown strand".

Personally, I would therefore replace the ? with a .
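For what it's worth, that replacement could be scripted along these lines. This is only a minimal sketch: the function name `replace_unknown_strand` is made up for illustration, and it assumes a tab-separated GTF with the strand in the 7th column and 9 columns per record.

```python
def replace_unknown_strand(gtf_lines, replacement="."):
    """Yield GTF lines with '?' in the strand column (7th) swapped for `replacement`.

    Hypothetical helper: assumes tab-separated records with 9 columns.
    """
    for line in gtf_lines:
        # Pass comment/header lines and blank lines through untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        newline = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9 and fields[6] == "?":
            fields[6] = replacement
            yield "\t".join(fields) + newline
        else:
            # Anything else (including malformed lines) is left as-is.
            yield line
```

To use it, open the input and output files yourself and call `dst.writelines(replace_unknown_strand(src))`. Passing `replacement="+"` or `replacement="-"` would set an explicit strand instead.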


A . would indicate a feature that is not stranded. That is likely not the case here (it would mean that expression occurs on both strands).


Since you included STAR as a tag, you intend to use the file for RNAseq read counting?

Having a ? in that column is valid per the GFF3 spec, where it means strandedness is relevant but unknown. You could look at how your reads align in that region and at their orientation. Then, taking the library type into account, you may need to change the strand value to + or - in case reads there are not counted by default.
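A rough way to check read orientation in that region is to tally the SAM flag bits of the alignments. The sketch below is an assumption-laden illustration, not a standard tool: `tally_read_strands` is a made-up name, it reads plain-text SAM records (for example, the text output of `samtools view` on the region), and it only looks at each alignment's leftmost position rather than its full span. The result is only informative for a stranded library, and for paired-end data the two mates have opposite orientations, so you would want to look at one mate at a time.

```python
from collections import Counter


def tally_read_strands(sam_lines, chrom, start, end):
    """Count forward/reverse primary alignments starting in [start, end] (1-based).

    Hypothetical helper: works on text SAM lines, not on BAM directly.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        if rname != chrom or not (start <= pos <= end):
            continue
        # Flag bit 0x10 is set when the read aligns to the reverse strand.
        counts["reverse" if flag & 0x10 else "forward"] += 1
    return counts
```

For the transcript in the question, this would be called with `chrom="NC_011033.1"`, `start=11024`, `end=315294`. A strong skew toward one orientation would hint at the strand, subject to the library protocol.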


Thank you all for great suggestions.

Yes, I am using STAR for alignment and then salmon for counting. I usually remove the "?" entries, but this time the gene is important for my analysis, and my library type is unstranded. So, will it make any difference if I replace "?" with +, -, or . ?


If your library is unstranded then replacing the ? with . may be a good place to start.

BTW: have you tried counting with the ? left in the file? What happens in that case?
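If you do try that, it may help to first list which genes carry a ? strand, so you know which IDs to look for in the count output. A minimal sketch, assuming the usual `gene_id "..."` attribute in the 9th column (the function name `genes_with_unknown_strand` is hypothetical):

```python
import re

# Matches the gene_id attribute as written in the example record above.
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def genes_with_unknown_strand(gtf_lines):
    """Return the sorted gene_ids of all records whose strand column is '?'."""
    genes = set()
    for line in gtf_lines:
        if line.startswith("#"):  # comment/header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9 and fields[6] == "?":
            match = GENE_ID.search(fields[8])
            if match:
                genes.add(match.group(1))
    return sorted(genes)
```

Comparing that list against the IDs present in the quantification results would show whether the ? records were silently dropped.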
