featureCounts --primary tag counting uniquely mapped reads and excludes multi-mapped reads?
1
1
Entering edit mode
4.2 years ago
solyris83 ▴ 20

Hi,

I am doing a featureCounts on 4 RNA-seq samples, code as found below. I am counting on the "gene" from gencode v33 as seen in the gtf file. Reason why I am doing this is to counter-check the counting process against the aligner's (STAR in this case) summary output.

 featureCounts -a gencode.v33.primary_assembly.annotation.gtf -o out.txt -O --fraction -B -t "gene" -s 2 -T 8 --primary -p 504-709_S21Aligned.sortedByCoord.out.bam 504-710_S22Aligned.sortedByCoord.out.bam 505-702_S26Aligned.sortedByCoord.out.bam 505-703_S27Aligned.sortedByCoord.out.bam

Am I right to assume that the above code will correspond to the uniquely mapped reads summary in STAR? The counting above is very close but not exactly same to the reported uniquely mapped reads from STAR, hence my assumption.

If so, what does the --primary tag do in this case, as I assume it will count the multi-mapped reads which is given a primary tag.

Regards Solyris

featureCounts STAR • 3.8k views
ADD COMMENT
1
Entering edit mode
4.2 years ago

featureCounts will not count multi-mappers by default. In your case, with "--primary" specified, it will count each multi-mapping read once (it counts the alignment marked as primary and ignores all the rest, which are marked as secondary). So your command as posted should differ from STAR's uniquely mapped reads slightly. For ignoring multi-mappers altogether, simply remove the "--primary" parameter.

You may also see slight differences in counting depending on how the different programs handle overlaps when counting, for example.

ADD COMMENT
1
Entering edit mode

This

with "--primary" specified, it will count each multi-mapping read once

is not entirely true (at least for featureCounts v2.0.0). featureCounts recognizes multimapped reads using the NH:i tag. If you use --primary without -M featureCounts will count only uniquely mapped reads (NH:i:1) and completely ignore multimapped reads regardless on the SAM primary tag. Using --primary in this scenario doesn't have any effect on counting whatsoever.

In case the aligner doesn't output the NH:i flag, featureCounts doesn't recognize unique/multimapped and will count all the alignments. In this case, using --primary will help you to count each read only once.

If you want to count both uniquely and multimapped reads (and your aligner outputs the NH:i flag) but each of the reads only once you can use --primary -M.

ADD REPLY

Login before adding your answer.

Traffic: 2004 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6