Question: featureCounts reports more fragments than input read pairs
Asked 2.2 years ago by ewre (United States):

Hi all, I have a question about the output of featureCounts (from the Subread package). The total number of my input read pairs is 47168870 (reported by FastQC and STAR):

>                           Number of input reads | 47168870
>                       Average input read length | 152
>                                     UNIQUE READS:
>                    Uniquely mapped reads number | 37604677
>                         Uniquely mapped reads % | 79.72%
>                           Average mapped length | 149.39
>                        Number of splices: Total | 16519867
>             Number of splices: Annotated (sjdb) | 0

But the total number of fragments reported by featureCounts is 82845035, almost twice the number of input read pairs. The number of SAM alignment pairs reported by htseq-count is 81037190.

> (featurecount): Total fragments : 82845035                            
>   Successfully assigned fragments : 32386307 (39.1%)


> 81000000 SAM alignment record pairs processed. Warning: Mate pairing
> was ambiguous for 965089 records; mate key for first such record: 
> 81037190 SAM alignment pairs processed.

The GFF I used for counting is gencode.v24.annotation.gff3.

My question: how is "fragment" defined in the featureCounts report, and why are there more fragments than input read pairs? In my understanding, each read pair represents one fragment, so the total number of fragments and the total number of read pairs should be equal.

Thank you for your time in advance.


modified 2.0 years ago by Biostar • written 2.2 years ago by ewre

Did you use the -p option to count fragments instead of reads?

written 2.2 years ago by genomax

Yes: featureCounts -p -s 2 -T 5 -a gencode.v24.annotation.gff3 -t exon -g gene_id -o sample.out sample.bam

written 2.2 years ago by ewre

Ewre, I am in the same situation. If you found the answer to your question, could you kindly share it?

written 2.0 years ago by cyd

If you isolate the read names of all the reads that have mapped, and then sort | uniq them, how many do you get? Is there a chance that multi-mapping (the same read pair aligned to several positions) is happening?

written 2.0 years ago by Macspider
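
The check suggested above can be sketched in shell. This is a minimal sketch, not a definitive recipe: the function names `count_mapped_names` and `count_mapped_records` are hypothetical helpers, the input is assumed to be SAM text on stdin (for example from `samtools view`, assuming samtools is installed), and `sample.bam` is the file name from the command earlier in the thread. A record is treated as mapped when FLAG bit 0x4 is not set.

```shell
# Count distinct read names among mapped records in SAM text read from stdin.
# Header lines start with "@"; FLAG (column 2) bit 0x4 marks an unmapped record.
count_mapped_names() {
    awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0 { print $1 }' | sort -u | wc -l | tr -d ' '
}

# Count mapped alignment records (one per line), for comparison.
count_mapped_records() {
    awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0' | wc -l | tr -d ' '
}

# Example usage (assumes samtools is available):
#   samtools view sample.bam | count_mapped_names
#   samtools view sample.bam | count_mapped_records
```

For paired-end data both mates share one read name, so a pair aligned to a single location contributes two records and one name. If the record count divided by two is clearly larger than the number of distinct names, some pairs appear at more than one location, which would be consistent with multi-mapping inflating the fragment total.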