4.0 years ago
oma219 ▴ 30

I have a question about my gene counting results. I was using featureCounts and htseq-count and comparing the output of the two. They agree nicely in terms of ratios, but the counts themselves are not the same: featureCounts seems to count about twice as many reads for each gene as htseq-count does. Is that common for other people, and is there a reason for it? Thanks.

Do you have paired-end data? Possibly featureCounts counted both reads of each pair?

Exactly. featureCounts needs an extra -p flag.
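
For reference, a minimal sketch of how the paired-end call could be assembled from Python. The file names (`aln.bam`, `genes.gtf`, `counts.txt`) are placeholders and the helper `featurecounts_cmd` is hypothetical. One caveat, as far as I know: from Subread 2.0.2 onwards `-p` only declares the input as paired-end, and `--countReadPairs` is additionally needed to count fragments, so check which version you have installed.

```python
import shlex

def featurecounts_cmd(bam, gtf, out, paired=True, count_read_pairs=False):
    """Build (not run) a featureCounts command line as a list of arguments."""
    cmd = ["featureCounts", "-a", gtf, "-o", out]
    if paired:
        cmd.append("-p")  # paired-end input
    if paired and count_read_pairs:
        # Assumption: only needed on newer Subread releases (2.0.2+).
        cmd.append("--countReadPairs")
    cmd.append(bam)
    return cmd

# Placeholder file names, for illustration only.
cmd = featurecounts_cmd("aln.bam", "genes.gtf", "counts.txt")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.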

Yeah, that helped bring down some of the counts. Is there a similar option for htseq-count? I couldn't find anything relating to paired-end data when I looked at the options. I think my pairs aren't adjacent in the file, so I'm not sure if that's what is causing the problem.

If you want to know which one is more accurate, look up a gene (preferably with low counts, say ~10) and check in IGV how many reads you can count manually.

Probably because featureCounts is including multi-hit reads? But I thought that htseq-count did the same. Tell us if you find the answer.

Probably because featureCounts is including multi-hit reads?

By default it doesn't.

Do also let us know how you handled the multi-mappers when you did the actual mapping.
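
One rough way to see how many multi-mappers are in the alignments is to tally the `NH:i` tag, which is also what featureCounts looks at to recognise multi-mapping reads. A minimal sketch, assuming the aligner writes `NH` (not all do) and that the input is SAM text, e.g. lines from `samtools view -h`. The function name `nh_tally` and the toy records are made up for illustration.

```python
from collections import Counter

def nh_tally(sam_lines):
    """Classify mapped SAM records as unique / multi by their NH:i tag."""
    tally = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a valid alignment record
            continue
        if int(fields[1]) & 0x4:          # FLAG bit 0x4: read unmapped
            continue
        nh = None
        for tag in fields[11:]:           # optional tags start at column 12
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        if nh is None:
            tally["no NH tag"] += 1
        elif nh == 1:
            tally["unique"] += 1
        else:
            tally["multi"] += 1
    return tally

# Toy records, not real data.
toy = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r2\t256\tchr2\t300\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
tally = nh_tally(toy)
print(dict(tally))
```

This counts alignment records, not reads, so a read reported at two loci contributes two "multi" records.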

4.0 years ago
oma219 ▴ 30

I added the -p flag and it seems to help bring the counts down so that they are similar. However, for some of the less expressed genes, featureCounts clearly produces higher counts, e.g. 100 vs. 2 from htseq-count. Has anyone else run into this? For the heavily expressed genes the counts are almost identical, but for the less expressed ones they are more variable.
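
To pin down which genes disagree, the two outputs can be compared gene by gene. A minimal sketch, assuming the standard layouts: featureCounts writes a `#` comment line, a `Geneid ...` header, and the first sample's count in column 7; htseq-count writes `gene<TAB>count` plus summary rows starting with `__`. It also assumes integer counts. The function names, the thresholds, and the toy tables are hypothetical.

```python
import math

def parse_featurecounts(lines):
    """Gene -> count from featureCounts output (first sample column only)."""
    counts = {}
    for line in lines:
        if line.startswith("#") or line.startswith("Geneid"):
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) >= 7:
            counts[f[0]] = int(f[6])
    return counts

def parse_htseq(lines):
    """Gene -> count from htseq-count output, skipping the __summary rows."""
    counts = {}
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) >= 2 and not f[0].startswith("__"):
            counts[f[0]] = int(f[1])
    return counts

def discordant(fc, ht, min_abs_log2=1.0, pseudo=1):
    """Genes whose log2(featureCounts / htseq-count) is at least the cutoff."""
    hits = []
    for gene in sorted(set(fc) & set(ht)):
        lfc = math.log2((fc[gene] + pseudo) / (ht[gene] + pseudo))
        if abs(lfc) >= min_abs_log2:
            hits.append((gene, fc[gene], ht[gene], lfc))
    return hits

# Toy tables, not real data.
fc_lines = [
    "# toy featureCounts output",
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\taln.bam",
    "geneA\tchr1\t1\t1000\t+\t1000\t1000",
    "geneB\tchr1\t2000\t3000\t-\t1001\t100",
]
ht_lines = ["geneA\t990", "geneB\t2", "__no_feature\t50", "__ambiguous\t7"]

hits = discordant(parse_featurecounts(fc_lines), parse_htseq(ht_lines))
for gene, fc_n, ht_n, lfc in hits:
    print(f"{gene}\tfeatureCounts={fc_n}\thtseq-count={ht_n}\tlog2={lfc:.2f}")
```

The flagged genes are good candidates for the manual IGV check suggested above; overlapping features and multi-mapped reads are the first things worth looking at there.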

