HTseq no features
Entering edit mode
11 weeks ago
cat910064 • 0

I have got some problem when analyzing my RNAseq data and I would like to seek for a help. Here is the brief description of my pipeline:

Obtained fasta file of 150 PE reads from Novaseq platform followed by non-stranded library prep I conducted fastQC and trimmomatic for all the files Since our sample is bacterial RNA, we don’t really need pair end for each sample, so I did only use R1 for all the downstream pipeline. I conduct Bowtie2 for mapping to my reference genome and I got >90 % reads mapped (my sample is mixed sample with some host tissue contamination) I plan to finish the reads count using HTseq:


htseq-count -f bam -s no -t gene -i ID <sample.bam> <annotation.gff> > output.txt


Here's what I got from output.txt:

no_feature 14756459 ambiguous 25200 __too_low_aQual 1902927 __not_aligned 3938933 __alignment_not_unique 0

most reads are “no_feature“ , I scan through each gene, the reads counts are all "0". I check my bam file using IGV, I can see the reads were in deeded mapped to the genome.

What could possibly go wrong? I've tried using default, -s <yes/no/reverse> but non of these improved. Why cannot my alignment file (bam and sam) recognize the corresponding features in gff file? (I've made sure the types matches (e.g. "ID" instead "gene_id").

below is the what my gff file looked like: gff file screen shot


bam GFF HTseq • 372 views
Entering edit mode

Well, the obvious...what is the name of the chromosome in your bam?

The other possible issue could be if FeatureCounts is looking for exons in your file, and nothing is marked as exonic

Entering edit mode

The htseq-count FAQ has a section about using GFF instead of GTF as a reference. In short, you need to adjust the --idattr option.


Login before adding your answer.

Traffic: 1526 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6