I am using featureCounts to quantify my RNA-seq signal over gene regions. I am using it as follows:
For my single-end reads:
featureCounts -t exon -g gene_id -s 0 -O -T 10 -a <.gtf> -o <.counts> <.bam>
For my paired-end reads:
featureCounts -p -t exon -g gene_id -s 0 -O -T 10 -a <.gtf> -o <.counts> <.bam>
Where I am running into something confusing is with the -s option. I have run these with all three settings of -s and get differing proportions of 'successfully assigned reads'. -s 0 (unstranded) and -s 2 both assign about 25% of the reads, whereas -s 1 assigns only about 3% of reads to gene regions. The fact that both 0 and 2 produce similar proportions of assigned reads/fragments is interesting and must say something about the library, but I need help determining exactly what that is.
Thanks in advance. Let me know if any additional information is needed.
Another important detail when counting paired-end reads is what is being counted: basically, when read pairs (fragments) are counted instead of individual reads, the counts will be halved.
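To illustrate the halving, here is a minimal sketch on made-up data. It is not how featureCounts works internally. The fragment names, gene names and both helper functions are hypothetical, and it assumes both mates of every pair land on the same gene. As far as I know, recent Subread releases also need --countReadPairs alongside -p to count fragments rather than reads, so it is worth checking the help output of your installed version.

```python
from collections import Counter

# Hypothetical toy data: one (fragment_name, gene) entry per aligned mate.
# Each fragment contributes two mates, as in a properly paired library.
mates = [
    ("frag1", "geneA"), ("frag1", "geneA"),
    ("frag2", "geneA"), ("frag2", "geneA"),
    ("frag3", "geneB"), ("frag3", "geneB"),
]


def count_reads(mates):
    """Count every mate separately (read-level counting)."""
    return Counter(gene for _, gene in mates)


def count_fragments(mates):
    """Count each (fragment, gene) combination once (fragment-level counting)."""
    return Counter(gene for _, gene in set(mates))


read_counts = count_reads(mates)
fragment_counts = count_fragments(mates)

for gene in sorted(read_counts):
    print(gene, "reads:", read_counts[gene], "fragments:", fragment_counts[gene])
```

So a drop of about one half in the per-gene numbers between read-level and fragment-level counting is expected and does not mean data were lost.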
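Your library is reverse stranded, so if you set it to stranded (-s 1) featureCounts compares most of your reads against the wrong strand relative to the gene coding direction, and they are left unassigned. With unstranded (-s 0) reads are counted against both strands, so you capture any transcripts from that section of DNA regardless of whether the library is unstranded, stranded, or reverse stranded. This is why -s 0 and -s 2 assign similar proportions: in a reverse-stranded library nearly all of the reads that -s 0 picks up are the ones -s 2 expects.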
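The comparison of -s 1 against -s 2 can be sketched in a few lines of Python. This is only a rough heuristic and not part of featureCounts. Both function names and the 0.2 and 0.8 cutoffs are my own assumptions. The parser assumes the usual tab-separated .summary file that featureCounts writes next to the counts file, with a single BAM column.

```python
def assigned_fraction(summary_text):
    """Return Assigned / total reads from the text of a featureCounts .summary file."""
    counts = {}
    for line in summary_text.strip().splitlines():
        fields = line.split("\t")
        # Skip the header row ("Status<TAB>file.bam") and malformed lines.
        if len(fields) < 2 or fields[0] == "Status":
            continue
        counts[fields[0]] = int(fields[1])
    total = sum(counts.values())
    return counts.get("Assigned", 0) / total if total else 0.0


def guess_strandedness(frac_s1, frac_s2, low=0.2, high=0.8):
    """Guess the library type from the assigned fractions under -s 1 and -s 2.

    low/high are arbitrary illustrative cutoffs on the smaller/larger ratio.
    """
    smaller, larger = sorted((frac_s1, frac_s2))
    if larger == 0:
        return "undetermined"
    ratio = smaller / larger
    if ratio < low:
        # One setting assigns far more reads than the other: stranded library.
        return "reverse" if frac_s2 > frac_s1 else "forward"
    if ratio > high:
        # Both settings assign about the same number of reads.
        return "unstranded"
    return "ambiguous"


# The proportions reported in the question: about 3% with -s 1, about 25% with -s 2.
print(guess_strandedness(0.03, 0.25))
```

With the numbers from the question this reports a reverse-stranded library, so -s 2 would be the setting to use. If the result is ambiguous, a dedicated tool such as RSeQC's infer_experiment.py is a better check than this heuristic.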