Should you specify "-p" for paired end reads using featureCounts?
1
1
Entering edit mode
11 months ago
O.rka ▴ 710

I'm trying to understand whether or not I should be using the -p flag for featureCounts.

Here's the explanation:

  -p                  If specified, fragments (or templates) will be counted
                      instead of reads. This option is only applicable for
                      paired-end reads; single-end reads are always counted as
                      reads

Here's the output with and without the flag:

# Program:featureCounts v2.0.1; Command:"/expanse/projects/jcl110/anaconda3/envs/VEBA-assembly_env/bin/featureCounts" "-a" "veba_output/assembly/SRR5720299/intermediate/1__assembly/scaffolds.fasta.saf" "-o" "featurecounts.tsv" "-F" "SAF" "--tmpDir" "veba_output/assembly/SRR5720299/tmp/featurecounts" "-T" "1" "veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam"
Geneid  Chr Start   End Strand  Length  veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam
SRR5720299__NODE_1_length_71757_cov_11.4961 SRR5720299__NODE_1_length_71757_cov_11.4961 1   71757   +   71757   11519
SRR5720299__NODE_2_length_69346_cov_13.9814 SRR5720299__NODE_2_length_69346_cov_13.9814 1   69346   +   69346   13008
SRR5720299__NODE_3_length_55956_cov_13.3612 SRR5720299__NODE_3_length_55956_cov_13.3612 1   55956   +   55956   10725
SRR5720299__NODE_4_length_53799_cov_18.5165 SRR5720299__NODE_4_length_53799_cov_18.5165 1   53799   +   53799   12722
SRR5720299__NODE_5_length_48507_cov_9.1492  SRR5720299__NODE_5_length_48507_cov_9.1492  1   48507   +   48507   6543
SRR5720299__NODE_6_length_45224_cov_17.2711 SRR5720299__NODE_6_length_45224_cov_17.2711 1   45224   +   45224   10538
SRR5720299__NODE_7_length_41565_cov_12.9338 SRR5720299__NODE_7_length_41565_cov_12.9338 1   41565   +   41565   8035
SRR5720299__NODE_8_length_41525_cov_13.7932 SRR5720299__NODE_8_length_41525_cov_13.7932 1   41525   +   41525   7738
(base) [jespinoz@exp-15-06 TestVEBA]$
(base) [jespinoz@exp-15-06 TestVEBA]$ head featurecounts.paired.tsv
# Program:featureCounts v2.0.1; Command:"/expanse/projects/jcl110/anaconda3/envs/VEBA-assembly_env/bin/featureCounts" "-a" "veba_output/assembly/SRR5720299/intermediate/1__assembly/scaffolds.fasta.saf" "-o" "featurecounts.paired.tsv" "-F" "SAF" "--tmpDir" "veba_output/assembly/SRR5720299/tmp/featurecounts" "-T" "1" "-p" "veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam"
Geneid  Chr Start   End Strand  Length  veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam
SRR5720299__NODE_1_length_71757_cov_11.4961 SRR5720299__NODE_1_length_71757_cov_11.4961 1   71757   +   71757   5801
SRR5720299__NODE_2_length_69346_cov_13.9814 SRR5720299__NODE_2_length_69346_cov_13.9814 1   69346   +   69346   6573
SRR5720299__NODE_3_length_55956_cov_13.3612 SRR5720299__NODE_3_length_55956_cov_13.3612 1   55956   +   55956   5411
SRR5720299__NODE_4_length_53799_cov_18.5165 SRR5720299__NODE_4_length_53799_cov_18.5165 1   53799   +   53799   6386
SRR5720299__NODE_5_length_48507_cov_9.1492  SRR5720299__NODE_5_length_48507_cov_9.1492  1   48507   +   48507   3322
SRR5720299__NODE_6_length_45224_cov_17.2711 SRR5720299__NODE_6_length_45224_cov_17.2711 1   45224   +   45224   5330
SRR5720299__NODE_7_length_41565_cov_12.9338 SRR5720299__NODE_7_length_41565_cov_12.9338 1   41565   +   41565   4111
SRR5720299__NODE_8_length_41525_cov_13.7932 SRR5720299__NODE_8_length_41525_cov_13.7932 1   41525   +   41525   3892

Using paired-end method, it's ~2x without the paired-end method but not exact.

My questions:

  • Why isn't -p run 2X of a run without -p? Is it because only one read in the pair aligns in some cases?

  • Would it be a safe bet to just set -p for all my paired-end runs?

fastq rnaseq genomics • 1.2k views
ADD COMMENT
3
Entering edit mode
11 months ago
GenoMax 141k

-p now requires that you explicitly add --countReadPairs to avoid this ambiguity.

 -p                  Specify that input data contain paired-end reads. To
                      perform fragment counting (ie. counting read pairs), the
                      '--countReadPairs' parameter should also be specified in
                      addition to this parameter.

  --countReadPairs    Count read pairs (fragments) instead of reads. This option
                      is only applicable for paired-end reads.

Looks like you are not using a new enough version of subread, if you only have -p option.

That said you should set -p if you have paired-end data.

ADD COMMENT
0
Entering edit mode

the --countReadPairs applies only for new versions of featurcounts, in those cases -p has no effect on the counts that are reported

The OP must be using an older version, and the question is rather why isn't the number exactly 2x and whether they should be using -p at all.

  1. Always use paired counting for paired data, otherwise, the data would be artificially inflated to close to 2x, throwing off the statistical model
  2. It is not exactly 2x for the reasons you mention, it is not obvious how to count an orphan read, or other situations where the pair seems invalid
ADD REPLY
0
Entering edit mode

Based on the names this data looks to be some kind of assembly. We have not idea how that was made and what aligner is being used to align the reads so some of the odd behavior seen may be a result of the data in case.

ADD REPLY
0
Entering edit mode

So just to be clear, I should use both -p and --countReadPairs for my paired reads? Also, if I happened to use single ended reads (e.g., ONT reads) and the above options were on, would it throw an error or just ignore?

ADD REPLY
0
Entering edit mode

I should use both -p and --countReadPairs for my paired reads

As long as you have upgraded to a recent version that has these options.

There are no paired-end reads in case of ONT so these options do not apply.

ADD REPLY
0
Entering edit mode

Ok cool I just tested it. You can only use the above with paired end as it fails with single ended long reads. So to be clear, if you are using ONT reads then don't do -p and --countReadPairs.

ADD REPLY

Login before adding your answer.

Traffic: 2630 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6