Negative value in the "phase" column of a GFF3 file. What does it mean?
6.3 years ago
aramis.1994 ▴ 10

Hi you all,

I am trying to run featureCounts on miRNA reads from CHO cells for differential expression analysis. I created my own annotation file by mapping hairpin miRNA sequences downloaded from miRBase (http://www.mirbase.org/cgi-bin/mirna_summary.pl?org=cgr) against the Ensembl reference genome for CHO cells. For that purpose, I used gmap:

gmap -D ~/miRNA/crigri_gmap -d crigri_gmap -f 2 -n 0 -t 16 --gff3-cds=genomic hairpin_crigri_dna_mod.fa > trial_1.gff3

I get a .gff3 file with the 9 expected columns (seqid, source, type, start, end, score, strand, phase and attributes), but for some scaffolds I get a "-1" value in the "phase" field. According to Ensembl (https://www.ensembl.org/info/website/upload/gff3.html), the phase is "One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on." A value of -1 is not among these, so I don't know how to interpret it.
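To see how many rows are affected, something like the following could list every line whose phase column is outside the documented values. This is only a minimal sketch: the function name `find_invalid_phases` and the two-row in-memory example are made up for illustration, and it assumes a tab-separated file where "." is also accepted for features that have no phase.

```python
import io

# Values accepted in column 8: the three phases from the Ensembl description,
# plus "." (assumed here to be acceptable for features without a phase).
VALID_PHASES = {"0", "1", "2", "."}

def find_invalid_phases(handle):
    """Yield (line_number, feature_type, phase) for rows with an unexpected phase."""
    for line_number, line in enumerate(handle, start=1):
        # Skip comments/directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed 9-column feature row
        feature_type, phase = fields[2], fields[7]
        if phase not in VALID_PHASES:
            yield line_number, feature_type, phase

# Hypothetical two-row input modelled on the output shown in this post.
example = io.StringIO(
    "scaffold_6\tcrigri_gmap\tmRNA\t62246878\t62246977\t.\t+\t.\tID=m1\n"
    "scaffold_6\tcrigri_gmap\tCDS\t62246878\t62246975\t100\t+\t-1\tID=c1\n"
)
bad_rows = list(find_invalid_phases(example))
print(bad_rows)  # [(2, 'CDS', '-1')]
```

For a real file, the same function can be given an open file handle, e.g. `open("trial_1.gff3")`, instead of the `io.StringIO` example.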

This is how the head of my .gff3 file looks:

scaffold_6      crigri_gmap     gene    62246878        62246977        .       +       .       ID=cgr_let_7a_MI0020368.path1;Name=cgr_let_7a_MI0020368

scaffold_6      crigri_gmap     mRNA    62246878        62246977        .       +       .       ID=cgr_let_7a_MI0020368.mrna1;Name=cgr_let_7a_MI0020368;Parent=cgr_let_7a_MI0020368.path1;coverage=100.0;identity=100.0;matches=100;mismatches=0;indels=0;unknowns=0

scaffold_6      crigri_gmap     exon    1       62246878        100     +       .       ID=cgr_let_7a_MI0020368.mrna1.exon1;Name=cgr_let_7a_MI0020368;Parent=cgr_let_7a_MI0020368.mrna1;Target=cgr_let_7a_MI0020368 1 1 +

scaffold_6      crigri_gmap     CDS     62246878        62246975        100     +       -1      ID=cgr_let_7a_MI0020368.mrna1.cds1;Name=cgr_let_7a_MI0020368;Parent=cgr_let_7a_MI0020368.mrna1;Target=cgr_let_7a_MI0020368 1 98 +

Any help with this will be much appreciated. Thank you very much!!

rna-seq .gff file RNA-Seq • 1.9k views

This is strange; I have no idea why there is a negative value. To fix the phases, you can use agat_sp_fix_cds_phases.pl from AGAT.
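For intuition about what recomputing a phase involves, here is a rough sketch of the usual rule: the phase of a CDS segment is the number of bases to skip at its start to reach the next full codon. This is not AGAT's implementation. The function `recompute_phases` is hypothetical, and it assumes the segment lengths are given in transcript (5' to 3') order and that the first segment starts in frame unless told otherwise.

```python
def recompute_phases(cds_lengths, first_phase=0):
    """Return the phase of each CDS segment, given lengths in 5'->3' order."""
    phases = []
    phase = first_phase
    for length in cds_lengths:
        phases.append(phase)
        # After skipping `phase` bases and reading whole codons, the leftover
        # bases belong to a codon that the next segment must complete.
        phase = (3 - (length - phase) % 3) % 3
    return phases

single = recompute_phases([98])          # one 98-base CDS, as in the question
multi = recompute_phases([100, 50, 31])  # made-up three-segment example
print(single)  # [0]
print(multi)   # [0, 2, 0]
```

With a single CDS segment per hairpin, as in the output above, the result is simply the assumed starting phase.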
