Question: featureCounts strand specificity
1
gravatar for tonja.r
2.1 years ago by
tonja.r400
UK
tonja.r400 wrote:

 

I was working with featureCounts and had three runs while changing only one parameter: the strand specificity.  I would expect that the reads from -s 1 and the reads from -s 2 would sum up to the the number of reads outputted by -s 0.
 

featureCounts --minReadOverlap 50 -s 0 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam
featureCounts --minReadOverlap 50 -s 1 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam
featureCounts --minReadOverlap 50 -s 2 -Q 1 -T 12 -a matrix_gene.saf -F SAF -O -o $OUT/total_counts.txt file.bam

However, the results differ:

 

==> minReadOverlap50_strand1/total_counts.txt.summary <==

Status    WEN1_6558.sort.bam    WNN1_6545.sort.bam WNN2_6550.sort.bam

Assigned    4643945    8863560    8859072

==> minReadOverlap50_strand2/total_counts.txt.summary <==

Status    WEN1_6558.sort.bam    WNN1_6545.sort.bam  WNN2_6550.sort.bam

Assigned    4775184    9123446    9158397

==> minReadOverlap50/total_counts.txt.summary <==

Status   WEN1_6558.sort.bam    WNN1_6545.sort.bam   WNN2_6550.sort.bam

Assigned    8356549    15881043    15933323

 


4643945+4775184=9419129 != 8356549
Why don't the reads sum up?

alignment • 2.5k views
ADD COMMENTlink modified 2.1 years ago by Carlo Yague3.3k • written 2.1 years ago by tonja.r400
2
gravatar for Carlo Yague
2.1 years ago by
Carlo Yague3.3k
Belgium
Carlo Yague3.3k wrote:

You misunderstood the role of the -s option :

-s <int>      Indicate if strand-specific read counting should be performed.
                  It has three possible values:  0 (unstranded), 1 (stranded) and
                  2 (reversely stranded). 0 by default.

When you choose either option 1 or 2, reads mapping on both - and + strands are taken into accounts. So it is not expected to sum up. The difference between -s1 and -s2 is that the strands are reversed in -s2. '-' becomes'+' and '+' becomes '-'  and as a consequence, '+' reads are assigned to '-' features.

ADD COMMENTlink modified 2.1 years ago • written 2.1 years ago by Carlo Yague3.3k

I have ChIP-seq data, so I have "+" and "-" reads. I have regions where I want to count my reads. Regions can overlap and can be on "+" and on "-" strands. I do not care about the strands, meaning if any read falls into a feature on "-" strand, it should be counted, if any read falls into a feature on "+" strand, it should be counted. Does it mean that I have to use -s0 and not care about strand specificity?

ADD REPLYlink modified 2.1 years ago • written 2.1 years ago by tonja.r400

Exactly ! -s1 or -s2 are mostly used with strand-specific RNA-seq data. With ChIP-seq data you'll usually use -s0.

ADD REPLYlink modified 2.1 years ago • written 2.1 years ago by Carlo Yague3.3k
1
gravatar for OD
2.1 years ago by
OD10
Germany
OD10 wrote:

Hi,

inspired by the post: A: featurecounts vs samtools view

Maybe your saf file contains overlapping features in different directions. Hence featureCounts will assign reads depending on the -s parameter either to the one or the other feature (i.e. counting the same read twice) but will assign reads only once to one feature in -s 0.

Could you maybe check for those occurrences in your saf file and run featureCounts on only this portion of the file for a test?

ADD COMMENTlink written 2.1 years ago by OD10

You can also try running featureCounts with the -O option. It will count reads overlapping features once by feature.

ADD REPLYlink written 2.1 years ago by Carlo Yague3.3k

I did it (in commands above I have -O option) and still something was wring there.

ADD REPLYlink written 2.1 years ago by tonja.r400

Yes, my bad, I now know what your issue is. Replying in "answers"

ADD REPLYlink written 2.1 years ago by Carlo Yague3.3k

As mentioned in the comment above, I can run the featureCount with -O parameter. And I did run it with this parameter (see commands), so I expected that with -s 0 the reads would be counted for each feature, even it they overlap. I also had a test example with two different features on negative and positive strand overlapping one read. With -O and -s 0 the read was counted twice: once for positive and once for negative strand

ADD REPLYlink modified 2.1 years ago • written 2.1 years ago by tonja.r400
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 720 users visited in the last hour