Question: Difference in the counts generated using FeatureCounts and HTseq
4 months ago by
bioinformatics.queries50 wrote:

Hi Everyone

I have a question about the counts generated by featureCounts and HTSeq. The two tools give very different results. For example:

FeatureCounts

AluSx3       295781
MER5A        244353
MER41B        43925
MIR         1513933

HTSeq

AluSx3        88023
MER5A        111860
MER41B        18632
MIR           67211

What could be the reason for such a large difference, and which tool's counts should we use for our analysis? I am currently trying to generate counts for transposons.

Thanks

sequencing • 225 views
modified 4 months ago by h.mon32k • written 4 months ago by bioinformatics.queries50

Why did you use the "Tool" tag for a question? You must include the commands you used for htseq-count and featureCounts.

written 4 months ago by Shred280

Thank you so much for your response. I used the following commands, with the default parameters for both tools.

featureCounts

featureCounts -T 10 -a $GFF/hg19_rmsk_TE.gtf -o TE_featurecounts/${file}_featureCounts.txt $file.sorted.bam

htseq-count

htseq-count -f bam $file.sorted.bam $GFF/hg19_rmsk_TE.gtf > TE_counts/$file.count.out

Could you please suggest what should be done? Do I need to change any parameters?

modified 4 months ago • written 4 months ago by bioinformatics.queries50
4 months ago by
dariober11k
WCIP | Glasgow | UK
dariober11k wrote:

Can you post the commands you used for both featureCounts and htseq and the summary statistics they produce?

For one thing, the two programs have different defaults for filtering reads on mapping quality.

featureCounts v1.6.4:

  -Q <int>            The minimum mapping quality score a read must satisfy in
                      order to be counted. For paired-end reads, at least one
                      end should satisfy this criteria. 0 by default.

htseq

-a <minaqual>, --a=<minaqual>
Skip all reads with MAPQ alignment quality lower than the given minimum value (default: 10). MAPQ is the 5th column of a SAM/BAM file and its usage depends on the software used to map the reads.

I once compared the two and found negligible differences, but I can't remember exactly how I made the comparison.
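
A minimal sketch of how the two runs could be put on the same mapping-quality footing, reusing the commands posted in this thread. The wrapper function names and the MIN_MAPQ variable are hypothetical; -Q and -a are the options quoted above. The value 10 simply mirrors the htseq-count default and is an assumption, not a recommendation for transposons. Other defaults may also differ between the tools (strandedness, for example), and this sketch does not address them.

```shell
#!/bin/sh
# Hypothetical shared threshold; 10 mirrors the htseq-count default for -a.
MIN_MAPQ=10

# Run featureCounts with an explicit MAPQ filter instead of its default of 0.
# $1 = sorted BAM, $2 = GTF annotation, $3 = output file
count_with_featurecounts() {
    featureCounts -T 10 -Q "$MIN_MAPQ" -a "$2" -o "$3" "$1"
}

# Run htseq-count with the same threshold made explicit.
# $1 = sorted BAM, $2 = GTF annotation, $3 = output file
count_with_htseq() {
    htseq-count -f bam -a "$MIN_MAPQ" "$1" "$2" > "$3"
}

# Example calls, assuming $file and $GFF are set as in the original commands:
# count_with_featurecounts "$file.sorted.bam" "$GFF/hg19_rmsk_TE.gtf" "TE_featurecounts/${file}_featureCounts.txt"
# count_with_htseq "$file.sorted.bam" "$GFF/hg19_rmsk_TE.gtf" "TE_counts/$file.count.out"
```

Setting MIN_MAPQ=0 instead would make htseq-count as permissive as featureCounts on this one filter.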

modified 4 months ago • written 4 months ago by dariober11k

Thank you so much for your response. I used the following commands, with the default parameters for both tools.

featureCounts

featureCounts -T 10 -a $GFF/hg19_rmsk_TE.gtf -o TE_featurecounts/${file}_featureCounts.txt $file.sorted.bam

htseq-count

htseq-count -f bam $file.sorted.bam $GFF/hg19_rmsk_TE.gtf > TE_counts/$file.count.out

Could you please suggest what should be done? Do I need to change any parameters?

modified 4 months ago • written 4 months ago by bioinformatics.queries50

As I say in my answer, the defaults for mapping quality are different, and featureCounts does not filter on mapping quality by default. Since you mention transposons, it may well be that you have lots of reads with MAPQ 0, hence more counts with featureCounts.
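
One way to check whether MAPQ 0 reads explain the gap is to tabulate the MAPQ column of the BAM file. The sketch below is a rough illustration: the function names are hypothetical, it reads SAM text on stdin (for example from samtools view, assumed to be installed), and it relies on MAPQ being the 5th column as quoted in the htseq help above.

```shell
#!/bin/sh
# Print "MAPQ<TAB>count" for SAM records read from stdin, sorted by MAPQ.
mapq_histogram() {
    awk -F '\t' '
        /^@/ { next }                          # skip SAM header lines
        { n[$5]++ }                            # column 5 is MAPQ
        END { for (q in n) print q "\t" n[q] }
    ' | sort -n
}

# Count records whose MAPQ is below a threshold given as the first argument.
count_below_mapq() {
    awk -F '\t' -v min="$1" '!/^@/ && $5 < min { c++ } END { print c + 0 }'
}

# Example calls, assuming $file is set as in the original commands:
# samtools view "$file.sorted.bam" | mapq_histogram
# samtools view "$file.sorted.bam" | count_below_mapq 10
```

If most records fall below 10, htseq-count with its default -a would skip them while featureCounts with its default -Q would not apply that filter.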

written 4 months ago by dariober11k

So what filter do you suggest applying for transposons? What value should we set for -Q in featureCounts?

written 4 months ago by bioinformatics.queries50