featurecounts vs samtools view
1
0
Entering edit mode
8.5 years ago
tonja.r ▴ 600

I needed to check the results coming from featureCounts from subread. In fact, samtools outputted less reads then featurecounts did.

I run samtools as follows:

samtools view -c -q 1 -F 4 -L matrix_gene.bed file.bam
2732259

And featureCounts:

featureCounts --minReadOverlap 50 -s 1 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam
featureCounts --minReadOverlap 50 -s 2 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam

(minReadOverlap = 50 because the length of my reads is 50, -s 1 for stranded and -s 2 for reversely stranded, -Q mapping quality and -O for multiple overlap (as samtools outputs alignment multiple times if they overlap multiple regions)

samtools results: 2732259

featureCounts -s 1: 1661976
featureCounts -s 2: 1662917

So, the number of stranded and reverse stranded reads: 3324893

Why do the results differ if I specified the same parameters in both methods?

PS: I did not find what overlap between a read and a region samtools allows, but if the overlap less than the read length, samtools should have more reads than featureCounts.

software-error • 3.2k views
ADD COMMENT
1
Entering edit mode
8.5 years ago

The answer to this type of questions usually very simple - the two tools actually do different things. So the discrepancy is unlikely to be caused by software errors.

Of course why do they do different things - now that is harder to answer. Make sure to investigate the preconceptions and tacit assumptions that you have about the parameters and what the tools actually they do.

For example a read may overlap with features on both positive and negative strands - in which case it will be double counted. Etc.

ADD COMMENT

Login before adding your answer.

Traffic: 2748 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6