htseq-count produces no features amid alignment
7.4 years ago
bojingjia ▴ 10

I recently aligned some sequencing data with STAR against a prebuilt mm10 genome. Afterwards, I sorted and indexed the BAM files with samtools and generated read counts with htseq-count, using an mm10 Ensembl GTF file. However, every gene count is 0, and all of the reads are classified as no feature.

A number of other users have reported the same problem here on BioStars, but their concerns weren't resolved. I have to wonder if my reads failed to align, but a quick look at the bam files in IGV shows many aligned reads. Am I using a faulty mm10 annotation file? Would anyone have suggestions/comments?

htseq-count alignment RNA-Seq samtools • 2.7k views

A couple of checks:

  1. By default, htseq-count expects the BAM to be sorted by read name. Most aligners return a coordinate-sorted BAM, so either re-sort by name or tell htseq-count that the input is position-sorted (-r pos).
  2. Make sure you aren't using a GTF file downloaded from UCSC. Last time I checked, its gene_id values conflicted with its transcript_id values. Read more in the FAQs at the end of this page.
  3. Make sure the chromosome naming style is the same in your BAM and your GTF.
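
Checks 1 and 3 above can be roughly automated. The following is a minimal Python sketch, not part of htseq-count: it assumes you have the BAM header as text (for example from samtools view -H) and the GTF as an iterable of lines. The function names and the toy header and GTF line are made up for illustration.

```python
def sam_header_info(header_text):
    """Return (sort_order, sequence_names) parsed from SAM/BAM header text."""
    sort_order = None
    names = set()
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "@HD":
            # The SO: tag records how the file was sorted (e.g. coordinate, queryname).
            for field in fields[1:]:
                if field.startswith("SO:"):
                    sort_order = field[3:]
        elif fields[0] == "@SQ":
            # The SN: tag holds the reference sequence (chromosome) name.
            for field in fields[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return sort_order, names


def gtf_seqnames(gtf_lines):
    """Collect the first-column sequence names from GTF lines."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


# Toy inputs (hypothetical values): a UCSC-style header and an Ensembl-style GTF line.
demo_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chrM\tLN:500\n"
demo_gtf = ['1\ttoy\texon\t10\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A";']

demo_order, demo_bam_names = sam_header_info(demo_header)
demo_gtf_names = gtf_seqnames(demo_gtf)
demo_shared = demo_bam_names & demo_gtf_names

print("sort order:", demo_order)
print("names shared by BAM and GTF:", sorted(demo_shared))
```

If the set of shared names is empty, no read can overlap any annotated feature, which would be consistent with every read being counted as no feature.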
7.4 years ago
ablanchetcohen ★ 1.2k

Do the sequence names in the BAM file and the GTF file match?

UCSC and Ensembl use different chromosome nomenclatures.

I will never understand why we can sequence the human genome, and put men on the moon, but not agree whether chromosome 1 should be referred to as chr1 or 1.
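
If the names do differ, one possible workaround is to rewrite the first column of the GTF so that it follows the UCSC style. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, it maps only the numbered chromosomes, X, Y and MT, and it leaves scaffold names untouched because those do not follow a simple prefix rule. Using an annotation that already matches the genome used for alignment avoids the problem entirely.

```python
def ensembl_to_ucsc(name):
    """Map an Ensembl-style chromosome name to UCSC style (main chromosomes only)."""
    if name == "MT":
        return "chrM"  # the mitochondrial genome is MT in Ensembl, chrM in UCSC
    if name.isdigit() or name in ("X", "Y"):
        return "chr" + name
    return name  # scaffolds and anything unrecognised are left unchanged


def rename_gtf_line(line):
    """Return the GTF line with its sequence name converted to UCSC style."""
    if line.startswith("#") or not line.strip():
        return line  # comments and blank lines pass through untouched
    seqname, sep, rest = line.partition("\t")
    return ensembl_to_ucsc(seqname) + sep + rest


# Toy GTF line (hypothetical values) to show the conversion.
demo_line = '1\ttoy\texon\t10\t200\t.\t+\t.\tgene_id "GENE_A";'
demo_renamed = rename_gtf_line(demo_line)
print(demo_renamed.split("\t")[0])
```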

