HTseq-count error
20 months ago

Hi, I am trying to count miRNA reads from a set of SAM files using htseq-count. I downloaded the mouse miRNA GFF file from miRBase (mmu.gff3). This is the command I use:

htseq-count L23.unaligned.sam mmu.gff3 -t miRNA_primary_transcript

I get this error -

> Error processing GFF file (line 14 of file mmu.gff3):
Feature MI0021869 does not contain a 'gene_id' attribute
[Exception type: ValueError, raised in features.py:387]

Has anyone else run into this? Is this a problem with the miRBase GFF file? Thanks.


Based on the documentation, htseq-count defaults to using the GFF gene_id attribute as the feature ID. If you look at the content of mmu.gff3, though, none of the features contain a gene_id attribute, which is why you get the error. I assume this is intentional and follows the conventions miRBase uses for annotating miRNAs.

I suggest first deciding which feature annotations in mmu.gff3 you want to count against, i.e. whether you want overlaps with mature sequences or with primary transcripts, and selecting those from your GFF file (for example with -t, as in your command). Then run htseq-count with -i <id attribute>, where the attribute is ID, Alias, or Name, depending on what is defined in mmu.gff3.
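To see which attributes are actually available for each feature type before picking a value for -i, something like the following rough sketch may help. It is not part of HTSeq; the function name attribute_keys_by_type is my own, and the two sample lines only imitate the miRBase GFF3 layout (the coordinates, the mature ID, and the names are made up). Replace the sample with the lines of your own mmu.gff3.

```python
from collections import defaultdict

# Illustrative lines in a miRBase-like GFF3 layout.
# Coordinates, the MIMAT ID, and the names are placeholders, not real data.
SAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\t.\tmiRNA_primary_transcript\t1000\t1100\t.\t+\t.\t"
    "ID=MI0021869;Alias=MI0021869;Name=mmu-mir-example\n"
    "chr1\t.\tmiRNA\t1010\t1031\t.\t+\t.\t"
    "ID=MIMAT0000000;Alias=MIMAT0000000;Name=mmu-miR-example-5p;"
    "Derives_from=MI0021869\n"
)


def attribute_keys_by_type(lines):
    """Map each feature type (column 3) to the set of attribute keys (column 9)."""
    keys = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and GFF3 directives/comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        feature_type, attributes = fields[2], fields[8]
        for pair in attributes.split(";"):
            if "=" in pair:
                keys[feature_type].add(pair.split("=", 1)[0].strip())
    return dict(keys)


if __name__ == "__main__":
    # For a real file: attribute_keys_by_type(open("mmu.gff3"))
    found = attribute_keys_by_type(SAMPLE_GFF3.splitlines())
    for feature_type in sorted(found):
        print(feature_type, sorted(found[feature_type]))
```

If the primary transcripts in your file carry a Name attribute, as they do in this sample, then a call along the lines of htseq-count -t miRNA_primary_transcript -i Name L23.unaligned.sam mmu.gff3 should get past the gene_id error. Whether Name or ID is the better choice depends on how you want the rows of the count table labelled.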
