featureCounts fragment count
0
0
Entering edit mode
2.4 years ago

I can't figure out where the 17999315 fragments from featureCounts is from? Below is the output for the BAM file inputted to featureCounts.

samtools flagstat output for BAM input to featureCounts:

35483648 + 0 in total (QC-passed reads + QC-failed reads)
24895287 + 0 primary
10588361 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
20155191 + 0 mapped (56.80% : N/A)
9566830 + 0 primary mapped (38.43% : N/A)
24895287 + 0 paired in sequencing
12334176 + 0 read1
12561111 + 0 read2
9539764 + 0 properly paired (38.32% : N/A)
9539764 + 0 with itself and mate mapped
27066 + 0 singletons (0.11% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

featureCounts command (simplified):

featureCounts -pC -g gene_id -t exon -a annotation.gtf -s 0 -o counts.txt sample1.bam sample2.bam

featureCounts log (relevant parts):

//========================== featureCounts setting ===========================\\
||                                                                            ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : not counted                                      ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /home/chan/mice_microgravity/genomes/transcripts/ ... ||
||    Features : 663359                                                       ||
||    Meta-features : 43432                                                   ||
||    Chromosomes/contigs : 45                                                ||
||                                                                            ||
|| Process BAM file /home/chan/mice_microgravity/outputs_coordinate/picar ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||                                                                            ||
||    WARNING: reads from the same pair were found not adjacent to each       ||
||             other in the input (due to read sorting by location or         ||
||             reporting of multi-mapping read pairs).                        ||
||                                                                            ||
||    Read re-ordering is performed.                                          ||
||                                                                            ||
||    Total fragments : 17999315                                              ||
||    Successfully assigned fragments : 2902259 (16.1%)                       ||
||    Running time : 1.30 minutes                                             ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

featureCounts output:

Assigned            2902259
Unassigned_Ambiguity        132370
Unassigned_MultiMapping     5895062
Unassigned_NoFeatures       1602957
Unassigned_Unmapped     7466667
Unassigned_MappingQuality   0
Unassigned_FragmentLength   0
Unassigned_Chimera      0
Unassigned_Secondary        0
Unassigned_Nonjunction      0
Unassigned_Duplicate        0

BAM file:

SRR13535276.10782785    163 chr1    3054936 1   5S107M  =   3054936 107 CTGTGTTAACTGCTGTCTATATTGAATGCTTCTGCTTTAGCCTTGTTGTATAAGACACAAAGAGATAATGCGTCAGAATCTGAAGATTGTTGGATGTGCTTTTCTGCTTCCC    AAFFAJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ    PG:Z:MarkDuplicates NH:i:4  HI:i:4  nM:i:4  AS:i:204
SRR13535276.10782785    83  chr1    3054936 1   5S107M  =   3054936 -107    CTGTGTTAACTGCTGTCTATATTGAATGCTTCTGCTTTAGCCTTGTTGTATAAGACACAAAGAGATAATGCGTCAGAATCTGAAGATTGTTGGATGTGCTTTTCTGCTTCCC    JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFFAA    PG:Z:MarkDuplicates NH:i:4  HI:i:4  nM:i:4  AS:i:204
SRR13535276.6423970 163 chr1    3257264 255 131M    =   3257264 131 GAGCTGGACTGGCATGGCATTTGAGGCTCTTCTTAGGTGTAACCTCCAGGATTGTAAAACAACAACAATGAAAGTAAATATACCTAACACACAACAGGAACTAATGTAGTGCAAAACACATTGACAAGAAT AAAFFFFJJFJFFJJ<FJJFFF-F-FJJFFJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ PG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:2  AS:i:256
SRR13535276.6423970 83  chr1    3257264 255 131M    =   3257264 -131    GAGCTGGACTGGCATGGCATTTGAGGCTCTTCTTAGGTGTAACCTCCAGGATTGTAAAACAACAACAATGAAAGTAAATATACCTAACACACAACAGGAACTAATGTAGTGCAAAACACATTGACAAGAAT JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJFJFFFAA PG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:2  AS:i:256
SRR13535276.4876840 99  chr1    3257443 255 150M    =   3257448 155 CAATACAATAATGAAGAAGAAGAAGAAGATGATGATGATGATGATGGGGTGGTGCTGATGATGATGGTGATGACGGTGTCAACAACAATGATGACGACAACAACGATCAATAATCATCTAGGATAGAAGGACATTTCTTTACCACACTGA  AAAFFJJJJJFJJFJJJJJFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ  PG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:2  AS:i:294
SRR13535276.4876840 147 chr1    3257448 255 150M    =   3257443 -155    CAATAATGAAGAAGAAGAAGAAGATGATGATGATGATGATGGGGTGGTGCTGATGATGATGGTGATGACGGTGTCAACAACAATGATGACGACAACAACGATCAATAATCATCTAGGATAGAAGGACATTTCTTTACCACACTGAAGGCT  JJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJFJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJFFJJJJJJJJJJJJJJFJJJJJJJJJJJJJFFFAA  PG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:2  AS:i:294
SRR13535276.5177230 163 chr1    3445501 255 14S32M  =   3630773 185370  TATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT  AAFFFJAFJFJAJ<FJJJJJJJJJJJJFJJJJJJJJJJJJJFJJJ<  PG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:7  AS:i:112
SRR13535276.26623284    99  chr1    3496484 255 1S75M   =   3496484 75  TTTTTTACTTTTTTTTTTGTCTCCTTAGCACCATACAATAAGAAAGAGAGGTTTGATTAAAACATTTATACTAGTG    AAFFFFJJFJJJJJJJJJ<JJFFJJJJFJJJJAAJJJJJJJJJJJJJFJJJFJJJJJJJJJJJJJJJJJJFJJFJ7PG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:8  AS:i:132
SRR13535276.26623284    147 chr1    3496484 255 1S75M   =   3496484 -75 TTTTTTACTTTTTTTTTTGTCTCCTTAGCACCATACAATAAGAAAGAGAGGTTTGATTAAAACATTTATACTAGTG    JJJJJJJFJFJJJJFF<JJFJJJJJJJJFJ7FJJJJJJJJJFJJJJJJJJJJJJJJJJJJJFJAJJJJJJJFAAAAPG:Z:MarkDuplicates NH:i:1  HI:i:1  nM:i:8  AS:i:132
SRR13535276.13699393    163 chr1    3534600 1   65M =   3534600 65  GCTGTGTATCAGTCCATGTGCTTCCGTGGGTCTTTGTTTTAAATATTGTCCTCAGTTTACAGAGT   AAFFFFJJJJJJJJJJJJJJJJJJJFJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ   PG:Z:MarkDuplicates NH:i:3  HI:i:1  nM:i:0  AS:i:128

Annotation GTF file:

chr1    ensGene transcript  3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; transcript_id "ENSMUST00000193812";  gene_name "ENSMUSG00000102693";
chr1    ensGene exon    3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; transcript_id "ENSMUST00000193812"; exon_number "1"; exon_id "ENSMUST00000193812.1"; gene_name "ENSMUSG00000102693";
chr1    ensGene transcript  3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; transcript_id "ENSMUST00000082908";  gene_name "ENSMUSG00000064842";
chr1    ensGene exon    3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; transcript_id "ENSMUST00000082908"; exon_number "1"; exon_id "ENSMUST00000082908.1"; gene_name "ENSMUSG00000064842";
chr1    ensGene transcript  3205901 3216344 .   -   .   gene_id "ENSMUSG00000051951"; transcript_id "ENSMUST00000162897";  gene_name "ENSMUSG00000051951";
chr1    ensGene exon    3205901 3207317 .   -   .   gene_id "ENSMUSG00000051951"; transcript_id "ENSMUST00000162897"; exon_number "1"; exon_id "ENSMUST00000162897.1"; gene_name "ENSMUSG00000051951";
chr1    ensGene exon    3213609 3216344 .   -   .   gene_id "ENSMUSG00000051951"; transcript_id "ENSMUST00000162897"; exon_number "2"; exon_id "ENSMUST00000162897.2"; gene_name "ENSMUSG00000051951";
chr1    ensGene transcript  3206523 3215632 .   -   .   gene_id "ENSMUSG00000051951"; transcript_id "ENSMUST00000159265";  gene_name "ENSMUSG00000051951";
chr1    ensGene exon    3206523 3207317 .   -   .   gene_id "ENSMUSG00000051951"; transcript_id "ENSMUST00000159265"; exon_number "1"; exon_id "ENSMUST00000159265.1"; gene_name "ENSMUSG00000051951";
chr1    ensGene exon    3213439 3215632 .   -   .   gene_id "ENSMUSG00000051951"; transcript_id "ENSMUST00000159265"; exon_number "2"; exon_id "ENSMUST00000159265.2"; gene_name "ENSMUSG00000051951";
featureCounts • 1.8k views
ADD COMMENT
2
Entering edit mode

how does your bed file look? how many genes/loci? Can you post the first few lines from your bed and bam file?

ADD REPLY
0
Entering edit mode

I updated my post above. I'm using a GTF file instead of BED. Thanks!

ADD REPLY
0
Entering edit mode

Can you show us your command line when running featureCounts? Total fragments should simply be the number of reads processed. When you run featureCounts in paired-end mode (using the -p argument) it counts fragments rather than reads, as stated in the documentation: If specified, fragments (or templates) will be counted instead of reads. This option is only applicable for paired-end reads; single-end reads are always counted as reads.

ADD REPLY
0
Entering edit mode

Just updated my post, I was using the -p option. Hmm, yeah I would think the total fragments would be equal to the mapped alignments / 2 since it's paired-end, but even the total alignments 35483648 / 2 = 17741824, which is less than featureCount's 17999315 total fragments...

Also, do you know if featureCounts only consider properly paired reads too when assigning features (i.e., 9539764 in the flagstat output)? Thanks!

ADD REPLY
0
Entering edit mode

Hmm, thinking about it more, I guess it could be that some of the secondary alignments aren't paired (not both mates mapped), so it could be more than 10588361 / 2 fragments contributing to the total fragment count?

If I were to report the results, if essentially featureCounts is not counting multimappers by default, assuming it's only looking at properly paired ones too, then would it be accurate to say it assigned 2902259 of the 9539764 properly paired reads (about 30.4%)?

ADD REPLY

Login before adding your answer.

Traffic: 1807 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6