Question: Weird discrepancy of assigned reads in featureCounts when run on paired-end data
15 months ago by
Bergen, Norway
Michael Dondrup44k wrote:

I have noticed a weird discrepancy between the output of featureCounts when run in paired-end mode vs. single-end mode on a paired-end sample.

Comparing the percentage of assigned reads, I get 45% in single-end mode, which seems to be in line with other posts, vs. 5% in paired-end mode; the only difference between the two count runs is the -p parameter. I have seen a few posts reporting a low assignment rate, mostly on paired-end data, and the answers tend to suggest that something is wrong with the settings or the protocol. But what if that is not the case?

My data seems OK, with ~100% properly paired reads.

Single end counting command:

featureCounts -T64  -s 1 -M -a /export/jonassenfs/michaeld/licebase/genomedata/lsal76ribo/lsalGM.gff3 -o featurecounts-test.txt -g Parent -t exon  -B -R 10_S48_R1_001.fastq.gzAligned.sortedByCoord.out.bam > featurecounts.log

Paired-end counting command on the same file yielding only 5% assigned reads:

featureCounts -T64 -p -s 1 -M -a /export/jonassenfs/michaeld/licebase/genomedata/lsal76ribo/lsalGM.gff3 -o featurecounts-test-p.txt -g Parent -t exon  -B -R 10_S48_R1_001.fastq.gzAligned.sortedByCoord.out.bam > featurecounts-p.log

Here is the output of samtools flagstat, showing 100% paired reads:

167713978 + 0 in total (QC-passed reads + QC-failed reads)
3459569 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
167713978 + 0 mapped (100.00% : N/A)
164254409 + 0 paired in sequencing
82127291 + 0 read1
82127118 + 0 read2
164254236 + 0 properly paired (100.00% : N/A)
164254236 + 0 with itself and mate mapped
173 + 0 singletons (0.00% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

15 months ago by
Freiburg, Germany
Devon Ryan84k wrote:

Do you really want -s 1? I would think -s 2 is more likely for recent data.
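
If the library protocol is unknown, one way to check is to count the same BAM under all three strandedness settings and compare the assignment rates. The following is only a sketch: it assumes featureCounts is on the PATH, the function name `strand_sweep` and the `DRY_RUN` switch are made up for illustration, and the annotation and BAM paths are placeholders. Depending on the Subread version, `--countReadPairs` may also be needed alongside `-p` to count fragments rather than reads.

```shell
#!/bin/sh
# Sketch: run (or, with DRY_RUN=1, just print) featureCounts once per
# strandedness setting: -s 0 unstranded, -s 1 stranded, -s 2 reversely stranded.
# strand_sweep and DRY_RUN are illustrative names, not part of featureCounts.
strand_sweep() {
    annotation=$1
    bam=$2
    for s in 0 1 2; do
        # Build the command line as the positional parameters.
        set -- featureCounts -p -s "$s" -t exon -g Parent \
            -a "$annotation" -o "counts-s$s.txt" "$bam"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$@"      # print the command only
        else
            "$@"           # actually run featureCounts
        fi
    done
}
```

Calling `DRY_RUN=1 strand_sweep annotation.gff3 sample.bam` prints the three commands without running them. The setting with the clearly highest assignment rate is most likely the one matching the library.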


Ooh, they just told me it was 'strand-specific paired end', running featureCounts again....

Successfully assigned fragments : 73783106 (88.0%)

Thank you very much!
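
For comparing several runs, the assigned percentage can also be read from the `.summary` file that featureCounts writes next to its output. This is a rough sketch, assuming the usual tab-separated layout (a header line, then one status per line with its count in the second column) and a single BAM per run; `assigned_pct` is a hypothetical helper name, and its result should only roughly match the figure featureCounts prints in its log.

```shell
#!/bin/sh
# Sketch: percentage of assigned reads/fragments from a featureCounts
# .summary file. assigned_pct is an illustrative helper, not a Subread tool.
assigned_pct() {
    awk -F'\t' '
        NR == 1 { next }                     # skip the "Status <bam>" header
        { total += $2 }                      # sum counts over all statuses
        $1 == "Assigned" { assigned = $2 }   # remember the assigned count
        END { if (total > 0) printf "%.1f\n", 100 * assigned / total }
    ' "$1"
}
```

For example, `assigned_pct featurecounts-test-p.txt.summary` prints a single number for the paired-end run above.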

Reply written 15 months ago by Michael Dondrup

Glad it was that easy :) Everything these days uses the "reverse" setting.

Reply written 15 months ago by Devon Ryan

Not quite everything. Lots of bacterial strand-specific RNA-seq is done using RNA ligation, and thus is -s 1.

But you are right that most human/mouse/etc. samples seem to be using the dUTP strategy recently.

Reply written 7 months ago by predeus