Question: Warning when using gff format in featureCounts - miRNAs
Asked 20 months ago by MeiNB10 (Portugal):

Hi, I have a problem running featureCounts to generate a count matrix for miRNA.

This is my featureCounts command: featureCounts -F GFF -R -t "miRNA" -g ID -o output.counts -a miRNA-annotation.gff ccc_sorted.bam

With this command I obtain:

========= Running =======

Load annotation file miRNA-annotation.gff ...
Features : 16770
Meta-features : 16726
Chromosomes/contigs : 43

Process BAM file ccc_sorted.bam...
Single-end reads are included.
Assign reads to features...
Total reads : 668600
Successfully assigned reads : 38347 (5.7%)
Running time : 0.02 minutes

==============================

This rate is too low! So I tried adding "=" to the command: featureCounts -F GFF -R -t "miRNA" -g ID= -o output.counts -a miRNA-annotation.gff ccc_sorted.bam

With this command I obtain:

=========== Running =============

Warning: failed to find the gene identifier attribute in the 9th column of the provided GTF file. The specified gene identifier attribute is 'ID=' The attributes included in your GTF annotation are 'ID=miR1120-3p-898.path1'

Load annotation file miRNA-annotation.gff ...
Features : 16770
Meta-features : 16726
Chromosomes/contigs : 43

Process BAM file ccc_sorted.bam...
Single-end reads are included.
Assign reads to features...
Total reads : 668600
Successfully assigned reads : 668600 (100.0%)
Running time : 0.02 minutes

=====================

I don't know what the problem is. I tried to convert my GFF file to GTF, but I lost the miRNA information.

Any help would be appreciated
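Editor's sketch: if converting GFF to GTF drops the miRNA identifiers, one alternative is the SAF annotation format that featureCounts accepts with -F SAF (tab-separated columns GeneID, Chr, Start, End, Strand). The Python below is a minimal sketch, assuming the annotation uses GFF3-style key=value attributes in column 9 (as the warning message above suggests) and real tab separators between columns. The function names and the example variable are made up for illustration; the example line is taken from the annotation excerpt posted further down in the thread.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff3_to_saf(gff_text, feature_type="miRNA", id_key="ID"):
    """Return SAF text (GeneID, Chr, Start, End, Strand) for one feature type."""
    out = ["\t".join(["GeneID", "Chr", "Start", "End", "Strand"])]
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue  # malformed line or a different feature type
        attrs = parse_attributes(cols[8])
        if id_key not in attrs:
            continue  # no usable identifier on this line
        out.append("\t".join([attrs[id_key], cols[0], cols[3], cols[4], cols[6]]))
    return "\n".join(out) + "\n"


# One annotation line from the thread, rebuilt with explicit tabs.
example = "\t".join([
    "chr1A", "data", "miRNA", "755777", "755886", "100", "+", ".",
    "ID=miR1120-3p-898.path1;Name=miR1120-3p-898;"
    "Target=miR1120-3p-898 1 110;Gap=M110",
])
print(gff3_to_saf(example), end="")
```

The resulting table could be written to a file (say miRNA-annotation.saf, a hypothetical name) and passed as: featureCounts -F SAF -a miRNA-annotation.saf -o output.counts ccc_sorted.bam. With SAF input the identifier comes from the GeneID column, so -t and -g are not needed.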

raw counts mirna featurecounts • 1.3k views
ADD COMMENTlink modified 20 months ago • written 20 months ago by MeiNB10

Can you post a few lines from your GFF annotation file?

ADD REPLYlink written 20 months ago by genomax68k

GFF annotation file:

chr1A     data     miRNA   755777  755886  100     +       .       ID=miR1120-3p-898.path1;Name=miR1120-3p-898;Target=miR1120-3p-898 1 110;Gap=M110

chr1A     data     miRNA   755784  755886  100     +       .       ID=miR1120-3p-1946.path4;Name=miR1120-3p-1946;Target=miR1120-3p-1946 8 110;Gap=M103

chr1A     data    miRNA   1239102 1239256 100     -       .       ID=miR1125-5p-312.path1;Name=miR1125-5p-312;Target=miR1125-5p-312 1 155;Gap=M155

chr1A     data    miRNA   1736279 1736371 100     -       .       ID=miR1131-5p-73.path1;Name=miR1131-5p-73;Target=miR1131-5p-73 1 93;Gap=M93
ADD REPLYlink modified 20 months ago by genomax68k • written 20 months ago by MeiNB10
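Editor's sketch: in the excerpt above, the first two entries overlap on chr1A (755777-755886 and 755784-755886) but carry different IDs. By default featureCounts does not assign a read that overlaps more than one meta-feature (such reads are reported as Unassigned_Ambiguity) unless -O is given. Whether this explains the low assignment rate here is an assumption, not something the thread confirms. The Python below is a minimal sketch that lists overlapping features with different IDs; the function names are made up for illustration, and the four lines are rebuilt with explicit tabs. Note that the attribute key is ID, without the = sign.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def read_features(gff_text, feature_type="miRNA", id_key="ID"):
    """Return (chrom, start, end, feature_id) tuples for matching GFF lines."""
    features = []
    for line in gff_text.splitlines():
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_attributes(cols[8])
        if id_key in attrs:
            features.append((cols[0], int(cols[3]), int(cols[4]), attrs[id_key]))
    return features


def overlapping_pairs(features):
    """Return ID pairs that overlap on the same chromosome.

    Strand is ignored on purpose, mirroring unstranded counting.
    """
    by_chrom = {}
    for chrom, start, end, fid in features:
        by_chrom.setdefault(chrom, []).append((start, end, fid))
    pairs = []
    for group in by_chrom.values():
        group.sort()
        for i, (start_i, end_i, id_i) in enumerate(group):
            for start_j, end_j, id_j in group[i + 1:]:
                if start_j > end_i:
                    break  # sorted by start, so later features cannot overlap
                if id_i != id_j:
                    pairs.append((id_i, id_j))
    return pairs


rows = [
    ("chr1A", "755777", "755886", "+",
     "ID=miR1120-3p-898.path1;Name=miR1120-3p-898;Target=miR1120-3p-898 1 110;Gap=M110"),
    ("chr1A", "755784", "755886", "+",
     "ID=miR1120-3p-1946.path4;Name=miR1120-3p-1946;Target=miR1120-3p-1946 8 110;Gap=M103"),
    ("chr1A", "1239102", "1239256", "-",
     "ID=miR1125-5p-312.path1;Name=miR1125-5p-312;Target=miR1125-5p-312 1 155;Gap=M155"),
    ("chr1A", "1736279", "1736371", "-",
     "ID=miR1131-5p-73.path1;Name=miR1131-5p-73;Target=miR1131-5p-73 1 93;Gap=M93"),
]
gff_text = "\n".join(
    "\t".join([chrom, "data", "miRNA", start, end, "100", strand, ".", attrs])
    for chrom, start, end, strand, attrs in rows
)
for pair in overlapping_pairs(read_features(gff_text)):
    print(pair)
```

If many such pairs turn up in the full annotation, comparing a run with -O against the default run would show how much of the unassigned fraction comes from ambiguity.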

Please don't post additional information related to the question as an answer, please provide such info as comment or response to previous comments.

ADD REPLYlink written 20 months ago by Sej Modha4.2k

Have you tried using -g ID (or Name without the = sign)? You should also examine the alignments (using the annotation file) to see if the reads are aligning outside of the features you are interested in.

ADD REPLYlink modified 20 months ago • written 20 months ago by genomax68k

Yes, I tried both -g ID and -g Name, each with and without the = sign. The results for ID and Name are the same: without =, both give 5.7%; with =, both give 100%. The problem with the = is the warning.

ADD REPLYlink written 20 months ago by MeiNB10

Also try using the -M option to count multi-mapping reads. Have you checked the alignments using a genome viewer?

ADD REPLYlink written 20 months ago by genomax68k

The -M option isn't the solution; the results are the same. Nothing changes.

Yes, everything is correct. This BAM is a subset of my data that contains only the reads that mapped to the regions annotated as miRNAs. So 100% is normal and expected.

Maybe the problem is the GFF.

ADD REPLYlink modified 20 months ago • written 20 months ago by MeiNB10

Are there counts in the output file you get or are all values 0?

ADD REPLYlink modified 20 months ago • written 20 months ago by genomax68k

In the output I get one line (ignoring the header), with all chromosomes and the count 668600.

The output summary is:

Status                      ccc_sorted.bam
Assigned                    668600
Unassigned_Ambiguity        0
Unassigned_MultiMapping     0
Unassigned_NoFeatures       0
Unassigned_Unmapped         0
Unassigned_MappingQuality   0
Unassigned_FragmentLength   0
Unassigned_Chimera          0
Unassigned_Secondary        0
Unassigned_Nonjunction      0
Unassigned_Duplicate        0

ADD REPLYlink modified 20 months ago • written 20 months ago by MeiNB10
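Editor's sketch: a 100% assignment rate with a single output row suggests that every feature was collapsed into one meta-feature, so the table carries no per-miRNA counts. The Python below is a minimal sketch for checking this kind of result. It parses a summary like the one above and counts the data rows of a featureCounts output table, assuming the usual layout (a leading comment line starting with #, then a header row, then one row per meta-feature). The function names are made up for illustration.

```python
def parse_summary(text):
    """Parse a featureCounts summary for one BAM into {status: count}."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or fields[0] == "Status":
            continue  # skip blank lines and the header row
        counts[fields[0]] = int(fields[1])
    return counts


def assigned_fraction(counts):
    """Fraction of reads with status 'Assigned' (0.0 if the table is empty)."""
    total = sum(counts.values())
    return counts.get("Assigned", 0) / total if total else 0.0


def count_feature_rows(counts_table_text):
    """Count data rows in a counts table, skipping '#' comments and the header."""
    rows = [
        line for line in counts_table_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return max(len(rows) - 1, 0)  # first non-comment line is the column header


# Summary values copied from the post above.
summary_text = """Status ccc_sorted.bam
Assigned 668600
Unassigned_Ambiguity 0
Unassigned_MultiMapping 0
Unassigned_NoFeatures 0
Unassigned_Unmapped 0
Unassigned_MappingQuality 0
Unassigned_FragmentLength 0
Unassigned_Chimera 0
Unassigned_Secondary 0
Unassigned_Nonjunction 0
Unassigned_Duplicate 0
"""
summary = parse_summary(summary_text)
print(summary["Assigned"], assigned_fraction(summary))
```

Running count_feature_rows on the contents of output.counts and comparing the result with the "Meta-features" number printed at load time would show whether the table has one row per miRNA or a single collapsed row.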