Read counts based on genes from samtools, feature count and htseq-count
1
1
Entering edit mode
4.9 years ago

Hi All,

I am currently trying to find number of reads (sequenced from DNAseq) that are mapped to human genes. So I tried three different methods to validate whether I am getting the same read count. 1) samtools 2) featurecount 3) htseq-count

For example, considering this gene "WASH7p". As per the human GTF downloaded from UCSC for hg19, I have following exons in them. enter image description here

Method1: using samtools command enter image description here

Total read count as per samtools: 6

2) featurecount: enter image description here

3) htseq-count

htseq-count -f bam -s no -i gene_id Sample_001_TGACCA_L001.bam Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf > Sample_001_Read_count_based_on_gene.txt 2>Sample_001_htseq_count.log &

WASH7P 0

Why htseq-count is giving "0"? Am I missing some parameters?

DNAseq featurecount samtools htseqcount • 2.5k views
ADD COMMENT
0
Entering edit mode

Hi, Can you post the last lines of the Htseq output? There you'll find the number of reads which are ambiguous or land in un-annotated regions. Also the multi-mappings are listed there. If your reads map more than once, Htseq will not assign these to the genes.

ADD REPLY
0
Entering edit mode

Dear Michael,

Please find the requested information.

__no_feature 1234324

__ambiguous 123995

__too_low_aQual 347284

__not_aligned 2851

__alignment_not_unique 0

ADD REPLY
0
Entering edit mode
4.9 years ago

If you lower the value of -a in htseq-count it'll probably produce identical results to featureCounts (though it'll take longer to run).

ADD COMMENT

Login before adding your answer.

Traffic: 2729 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6