multiBamCov or htseq-count to count reads per feature?
9.1 years ago

Hi,

I'm wondering what the best method is to extract the number of reads for each feature in a GTF (or GFF, BED, ...) file. I tried htseq-count and multiBamCov, but they gave me different results. It seems that multiBamCov counts all the reads (completely and partially aligned) associated with each exon, which means many reads are counted twice or more.
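To make the double counting concrete, here is a minimal sketch in plain Python. The exon coordinates and read names are invented for illustration, and the overlap test is a simplification of what an interval-counting tool does, not a reproduction of multiBamCov itself.

```python
# Toy data only: intervals are half-open (start, end) and all names are made up.
exons = {"exon1": (100, 200), "exon2": (300, 400)}

# Each read is a list of aligned blocks (a spliced read has more than one block).
reads = {
    "read_a": [(120, 170)],              # fully inside exon1
    "read_b": [(180, 200), (300, 330)],  # spliced read touching exon1 and exon2
    "read_c": [(390, 440)],              # only partially overlaps exon2
}


def overlaps(a, b):
    """Return True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


# Count, for every exon, each read that has at least one block overlapping it.
per_exon = {
    name: sum(
        any(overlaps(block, exon) for block in blocks)
        for blocks in reads.values()
    )
    for name, exon in exons.items()
}

print(per_exon)
print(sum(per_exon.values()), "overlaps counted from", len(reads), "reads")
```

In this toy case the spliced read contributes to both exons, so the per-exon counts add up to more than the number of reads.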

After doing DE analysis (DESeq) on both read count matrices (one from htseq-count, one from multiBamCov), the results are quite surprising.

Genes differentially expressed at adjusted p-value < 0.05:

multiBamCov: 123 genes

htseq-count: 880 genes

Intersection: 118 genes

So which one should I use? Is it possible to make multiBamCov stricter? Or would another tool from bedtools be more suitable?

Thanks,

N.

htseq read counts • 3.8k views
9.1 years ago

Edit: still no replies...

Give people some time... It's been less than a day since your question was posted, and people around the world are working in different time zones.

Sorry for being too hasty.

9.1 years ago
Ryan Dale 4.9k

htseq-count tends to be more selective about which reads are counted. For example, it won't count ambiguous reads, according to the rules detailed at http://www-huber.embl.de/users/anders/HTSeq/doc/count.html. htseq-count also does not count multimappers (reads with BAM flag 0x100, which marks secondary alignments). Depending on your data, these differences could greatly influence the final results.
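As a rough illustration of why the two approaches diverge, the sketch below contrasts a permissive "count every overlap" rule with a stricter rule that skips alignments carrying the 0x100 flag and sets aside reads overlapping more than one gene. It is a simplified model on invented data, not the actual implementation of either tool: the gene names, coordinates, function names and the `__skipped_secondary` label are all hypothetical, and the ambiguity rule only approximates the "union" mode described in the htseq-count documentation.

```python
from collections import Counter

SECONDARY = 0x100  # SAM flag bit for "not primary alignment"

# Hypothetical annotation: gene -> list of half-open exon intervals.
genes = {
    "geneA": [(100, 200), (300, 400)],
    "geneB": [(350, 500)],
}

# Hypothetical alignments: (read name, SAM flag, aligned blocks).
alignments = [
    ("r1", 0, [(120, 170)]),              # geneA only
    ("r2", 0, [(180, 200), (300, 330)]),  # spliced, geneA only
    ("r3", 0, [(360, 390)]),              # overlaps geneA and geneB
    ("r4", SECONDARY, [(420, 470)]),      # secondary alignment in geneB
    ("r5", 0, [(600, 650)]),              # overlaps no gene
    ("r6", 0, [(450, 490)]),              # geneB only
]


def genes_hit(blocks):
    """Return the set of genes overlapped by any aligned block."""
    return {
        gene
        for gene, exons in genes.items()
        for exon in exons
        for block in blocks
        if block[0] < exon[1] and exon[0] < block[1]
    }


def count_all_overlaps(alns):
    """Permissive rule: every alignment counts for every gene it touches."""
    counts = Counter()
    for _name, _flag, blocks in alns:
        for gene in genes_hit(blocks):
            counts[gene] += 1
    return counts


def count_strict(alns):
    """Stricter rule: skip secondary alignments and ambiguous reads."""
    counts = Counter()
    for _name, flag, blocks in alns:
        if flag & SECONDARY:
            counts["__skipped_secondary"] += 1  # hypothetical label
            continue
        hit = genes_hit(blocks)
        if not hit:
            counts["__no_feature"] += 1
        elif len(hit) > 1:
            counts["__ambiguous"] += 1
        else:
            counts[hit.pop()] += 1
    return counts


print("permissive:", dict(count_all_overlaps(alignments)))
print("strict:    ", dict(count_strict(alignments)))
```

On this toy input the permissive rule assigns three alignments to each gene, while the stricter rule keeps two for geneA and one for geneB, which shows how quickly the count matrices can drift apart before any DE testing.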
