3.2 years ago
sanatbhadsavle
Hi, I am trying to count miRNA reads from a set of SAM files, using htseq-count to do so. I downloaded the mouse miRNA GFF file from miRBase (mmu.gff3). This is the command I use:
htseq-count L23.unaligned.sam mmu.gff3 -t miRNA_primary_transcript
I get this error -
> Error processing GFF file (line 14 of file mmu.gff3):
Feature MI0021869 does not contain a 'gene_id' attribute
[Exception type: ValueError, raised in features.py:387]
Has anyone else faced this? Is this a problem with the mirbase gff file? Thanks.
Based on the documentation it looks like htseq-count defaults to using the GFF gene_id attribute, but if you look at the content of mmu.gff3, none of the features contain a gene_id attribute. I assume this is intentional, based on the standards miRBase has set for annotating miRNAs.

I suggest first determining which feature annotations from mmu.gff3 to use for counting read overlap with miRNAs, i.e. do you want to count overlaps with mature sequences or with primary transcripts. Then select those features from your GFF file and run htseq-count with -i <id attribute>, where the attribute would be ID, Alias, or Name, based on what's defined in mmu.gff3.